<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Taras Mankovski Blog &#187; Taras</title>
	<atom:link href="http://taras.cc/index.php/author/admin/feed/" rel="self" type="application/rss+xml" />
	<link>http://taras.cc</link>
	<description>Building Beecoop and generally making developers&#039; lives easier and more productive.</description>
	<lastBuildDate>Wed, 25 Nov 2009 03:41:59 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How to deploy: database, source and binary changes in 1 patch?</title>
		<link>http://taras.cc/index.php/2009/11/24/how-to-deploy-database-source-and-binary-changes-in-1-patch/</link>
		<comments>http://taras.cc/index.php/2009/11/24/how-to-deploy-database-source-and-binary-changes-in-1-patch/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 03:40:15 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[CMS]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[Joomla]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=183</guid>
		<description><![CDATA[Today was an exciting day for me, again.
I was working on several clients&#8217; projects and something seems to have crystallized in my brain today. This crystallization created a new question that I posted on Stack Overflow: How to deploy: database, source and binary changes in 1 patch?
Amazingly, 6 hours went by and I only [...]]]></description>
			<content:encoded><![CDATA[<p>Today was an exciting day for me, again. <img src='http://taras.cc/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>I was working on several clients&#8217; projects and something seems to have crystallized in my brain today. This crystallization created a new question that I posted on Stack Overflow: <a href="http://stackoverflow.com/questions/1791426/how-to-deploy-database-source-and-binary-changes-in-1-patch">How to deploy: database, source and binary changes in 1 patch?</a></p>
<p>Amazingly, 6 hours went by and I only got 20 views and 0 answers. You know what this says to me? It says that people do not have an easy answer to this question. I know that I usually look for questions I can answer quickly, and I think this is not one of those questions.</p>
<p>Well, this actually was a test to see if something like this already exists, because I think I have an answer to this question. I outlined it in my conversation with Trevor today. I put it to you for consideration. I would love to get feedback, so please tell me what you think.</p>
<p>A transcript of this conversation is in this post:</p>
<p><span id="more-183"></span></p>
<blockquote><p><strong>Taras Mankovski</strong></p>
<p>I have multiple developers checking out projects from a repository. The repo contains a db dump of the project. They import the db, make changes to code, dump the db, and commit the changes and the db back to the repo. The problem is that the SQL dump is pretty much useless for figuring out what has changed in the db.</p>
<p>diff is a well-established mechanism for comparing text files. It works perfectly. I can, at any point, check what&#8217;s changed between revisions of a project. This is especially nice with git because I can choose arbitrary commits and see their changes.</p>
<p>This brings us back to the db. A SQL dump is pretty much useless in determining what has changed in the DB because of the nature of the file. It&#8217;s long, it&#8217;s wordy, it plain sucks to read.</p>
<p>So here is the solution:</p>
<p>I think it would be possible to represent the database as a directory with files in it. Every table in the database would be a directory. Every record would be a text file containing the record&#8217;s data, represented as JSON.</p>
<p>What this means is that it would be extremely easy to commit the database into the repository and see what has changed.</p>
<p>We can use existing established mechanisms to determine the changes in the database.</p>
<p>The information included in the file system can be quite extensive. For example, every directory can contain a schema file that represents the schema of the table as a JSON file. Again, it would be very easy to compare schema changes from revision to revision by comparing the schema files.</p></blockquote>
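<p>A minimal sketch of the layout described above, assuming SQLite as the database so that it runs with the Python standard library alone. The function names, the <i>_schema.json</i> file name and the choice to name each record file after its rowid are assumptions made for illustration, not part of the original idea.</p>
<pre><code>
import json
import os
import sqlite3
import tempfile


def write_json(path, data):
    # Sorted keys and one key per line keep diffs small and stable.
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2, sort_keys=True)
        handle.write("\n")


def dump_database(conn, root):
    """Write every table as a directory and every row as a JSON file."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        table_dir = os.path.join(root, table)
        os.makedirs(table_dir)
        # PRAGMA table_info yields (cid, name, type, notnull, default, pk).
        columns = conn.execute('PRAGMA table_info("%s")' % table).fetchall()
        schema = [{"name": c[1], "type": c[2], "pk": bool(c[5])}
                  for c in columns]
        write_json(os.path.join(table_dir, "_schema.json"), schema)
        names = [c[1] for c in columns]
        for row in conn.execute('SELECT rowid, * FROM "%s"' % table):
            record = dict(zip(names, row[1:]))
            write_json(os.path.join(table_dir, "%d.json" % row[0]), record)


# Demo with a throwaway in-memory database and a temporary directory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO node (nid, title) VALUES (1, 'Hello')")
root = os.path.join(tempfile.mkdtemp(), "db")
dump_database(conn, root)
print(sorted(os.listdir(os.path.join(root, "node"))))
</code></pre>
<p>Committing the resulting <i>db</i> directory would let an ordinary diff show a changed title as a one-line change in one small file.</p>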
<blockquote><p><strong>Trevor Twining</strong><br />
I think it&#8217;s a really interesting solution to the problem, just trying to work out how implementation details would work with Drupal&#8217;s architecture and how it might support or conflict with other partial solutions to the problem.</p>
<p>There&#8217;s been a push in Drupal lately to allow more config to be exported as code for just that reason, but a lot of it is happening at the object level. There&#8217;s a growing number of modules that are using OOP principles in their code even though Drupal is procedural, so they&#8217;re basically providing var exports for some of those objects. Some of that work is for reusability of code as well, so that configurations can be saved and implemented in different projects.</p></blockquote>
<blockquote><p><strong>Taras Mankovski</strong><br />
Ok, so there is more. The title of this wave is Unified Patching. The problem with what we have right now is that &#8220;It&#8217;s very difficult to deploy changes to the server.&#8221;</p>
<p>The problem actually is that deployment is easy, if you&#8217;re doing it wrong. Deployment is hard if you&#8217;re doing it right.</p>
<p>What do I mean by that?<br />
It&#8217;s easy to make changes to client&#8217;s site by working directly on client site. That in many many many ways is very wrong. If you break something you&#8217;re essentially screwed, it looks bad and it&#8217;s just extremely unprofessional. But it&#8217;s so damn easy.</p>
<p>The hard part is to do it right &#8211; do all development offline, test the changes, show the changes to the client on demo server, stage the changes and then finally apply them to client&#8217;s site without impacting the live site. This is hard.</p>
<p>I&#8217;m finding that deployment takes as much time as development. This is what I want to change. This is where Unified Patching will come in.</p>
<p>I&#8217;m breaking up project changes into 4 categories.</p>
<ul>
<li>Database Structure</li>
<li>Database Data</li>
<li>Source Code</li>
<li>Binary Files</li>
</ul>
<p>Actually, when working with CMS, Database Data can be broken up into 2 categories: Content Changes and Structural Changes. In Drupal, this would be defining the content types and the actual content data. We will ignore this point for now.</p>
<p>I think it would be possible to systematically apply all 4 of these changes using a unified tool, that would apply them in a reliable and predictable manner.</p>
<p>What I&#8217;m talking about is the Unix &#8216;patch&#8217; utility on steroids.</p>
<p>I have an idea of how this could be done in a very simple and reliable way.</p>
<p>Ok, so I think it would be possible to use diff to generate Unified Patches.</p>
<p>Diff is a very simple format that can be generated from any VCS.</p>
<p>Git&#8217;s diff includes information about the changes that happened between versions of source code, binary files and permissions. It has no capacity to handle database changes.</p>
<p>I think if we were to use the solution outlined above, then we could create a Unified Patch generator that would parse the diff to determine changes in the 4 data types that were outlined above.</p>
<p>For example:</p>
<p>Let&#8217;s assume that we have all of the database dumped as files onto the file system. We are going to run diff between 2 revisions.</p>
<ol>
<li>Parse the diff to determine if any of the schema files in the /db directory have changed. If they have, generate ALTER statements that correspond to the changes made to the schema.</li>
<li>Parse the diff to determine if any record files in the /db directory have changed. If they have, create a corresponding REPLACE or DELETE/INSERT statement for each changed file.</li>
<li>If binary files were added, removed or modified, include the modified files in a temporary location.</li>
<li>Include a standard patch for source code or text files.</li>
</ol>
<p>Now that we have all of the data types processed, put them together into 1 directory and tar.gz the directory.</p>
<p>Now you have a Unified Patch.</p>
<ol>
<li>Upload the patch to the server.</li>
<li>Run upatch ourpatch.tar.gz</li>
</ol>
<p>upatch will perform the following actions.</p>
<ol>
<li>dry-run database schema changes (possibly inside of a transaction)</li>
<li>dry-run database content import</li>
<li>dry-run binary file move</li>
<li>dry-run patch</li>
</ol>
<p>I don&#8217;t know how to do steps 1 and 2. Step 3 could be fairly simple; we just need to know if we have permissions to overwrite these files.</p>
<p>If the dry-run succeeds, then it&#8217;s fairly safe to perform the live update.</p></blockquote>
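<p>The first steps of the patch generator could be sketched roughly as below. It assumes the changes are listed with <i>git diff --name-status</i>, that schema files are named <i>_schema.json</i>, that record files live at <i>db/TABLE/ID.json</i>, and that binary files can be recognised by extension. All of these, along with the function names, are hypothetical choices for illustration. Renames and the generation of ALTER statements are left out.</p>
<pre><code>
import os

# Assumption: a short extension list is enough to spot binary files.
BINARY_EXTENSIONS = {".jpg", ".png", ".gif", ".pdf", ".zip"}


def classify_changes(name_status):
    """Sort `git diff --name-status` output into the four change types."""
    changes = {"schema": [], "data": [], "binary": [], "source": []}
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, path = line.split("\t", 1)
        if path.startswith("db/"):
            is_schema = os.path.basename(path) == "_schema.json"
            kind = "schema" if is_schema else "data"
        elif os.path.splitext(path)[1].lower() in BINARY_EXTENSIONS:
            kind = "binary"
        else:
            kind = "source"
        changes[kind].append((status, path))
    return changes


def data_statement(status, path, record=None):
    """Build a parameterised SQL statement for one changed record file."""
    table, filename = path.split("/")[1:3]
    if status == "D":
        # Strip the ".json" suffix to recover the row id.
        return 'DELETE FROM "%s" WHERE rowid = ?' % table, [int(filename[:-5])]
    columns = sorted(record)
    sql = 'REPLACE INTO "%s" (%s) VALUES (%s)' % (
        table, ", ".join(columns), ", ".join("?" for _ in columns))
    return sql, [record[c] for c in columns]


sample = ("M\tdb/node/_schema.json\n"
          "A\tdb/node/7.json\n"
          "D\tdb/node/3.json\n"
          "M\tsites/default/logo.png\n"
          "M\tmodules/blog/blog.module\n")
changes = classify_changes(sample)
for kind in sorted(changes):
    print(kind, changes[kind])
print(data_statement("A", "db/node/7.json", {"nid": 7, "title": "Hi"}))
</code></pre>
<p>Keeping the values as parameters rather than pasting them into the SQL text avoids quoting problems when a record contains quotes or serialized data.</p>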
<blockquote><p><strong>Trevor Twining</strong><br />
I don&#8217;t know if you could do a dry run, but you could make changes in a test version (makes a dump of current live db and changes that instead of live)</p>
<p>transactional support might allow you to do the rollback, but you&#8217;d need to be able to do several operations before the commit, shouldn&#8217;t be a problem, but dummy database might be more useful because you can connect to it and actually see if anything is messed up, which you couldn&#8217;t do with a transactional approach.</p></blockquote>
<blockquote><p><strong>Taras Mankovski</strong><br />
We really just want to know if the query succeeded. The other approach would be to use a test database, like you said, but actually make it a staging environment where you can push the changes and test that the site is working before you apply the same change to the live site.</p></blockquote>
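<p>For the transactional variant of the dry-run, here is a small sketch that assumes SQLite, where schema changes can be rolled back. The <i>dry_run</i> function and the table names are made up for the example. As far as I know, MySQL commits implicitly on statements such as ALTER TABLE, so on MySQL this would only cover data changes, and Trevor&#8217;s copy of the live database would be the safer route for schema changes.</p>
<pre><code>
import sqlite3


def dry_run(conn, statements):
    """Run statements inside a transaction, then always roll back.

    Returns None when every statement succeeded, else the error message.
    """
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
        return None
    except sqlite3.Error as error:
        return str(error)
    finally:
        if conn.in_transaction:
            conn.execute("ROLLBACK")


# isolation_level=None stops the sqlite3 module from opening transactions
# on its own, so the explicit BEGIN and ROLLBACK are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")

good = ["ALTER TABLE node ADD COLUMN body TEXT",
        "INSERT INTO node (nid, title, body) VALUES (1, 'Hello', 'World')"]
bad = ["ALTER TABLE missing ADD COLUMN body TEXT"]

print(dry_run(conn, good))
print(dry_run(conn, bad))
print([c[1] for c in conn.execute("PRAGMA table_info(node)")])
</code></pre>
<p>After both calls the table still has only its original columns and no rows, which is the point of the dry-run: it reports whether the statements would succeed without leaving anything behind.</p>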
<blockquote><p><strong>Trevor Twining</strong><br />
Here&#8217;s a question though: what about the cases where the database is storing some serialized version of a code object for persistence? Not looking for an immediate answer, but it&#8217;s an important question.</p></blockquote>
<blockquote><p><strong>Taras Mankovski</strong><br />
Is the data stored as a PHP serialized object?</p></blockquote>
<blockquote><p><strong>Trevor Twining</strong><br />
a portion, yes, and it&#8217;s how code is shared/exported, basically via a var_export call. code for views can then be moved out of the database and stored as modules</p></blockquote>
<blockquote><p><strong>Taras Mankovski</strong><br />
OK, let&#8217;s take an example:<br />
Multiple developers are working on the same site.<br />
Developer A checked out changes from the repo, made changes in the view and committed them.<br />
At the same time, Developer B also made changes to the db and tried to commit them. For Developer B, the commit would fail because his db files are out of date. He needs to perform a merge. I do not know how you would merge 2 view exports or if it&#8217;s at all possible, but at least the developers know right away that there is a conflict in the views.</p></blockquote>
<p>That&#8217;s it for today, not bad for 1 day&#8217;s work.</p>
<p>Ok, what do you guys think about this?</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/11/24/how-to-deploy-database-source-and-binary-changes-in-1-patch/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing a new project called: Social Photography</title>
		<link>http://taras.cc/index.php/2009/11/18/announcing-a-new-project-called-social-photography/</link>
		<comments>http://taras.cc/index.php/2009/11/18/announcing-a-new-project-called-social-photography/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 20:14:51 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=180</guid>
		<description><![CDATA[Some of you know that in my spare time I&#8217;ve been taking pictures of people at parties.
Well, I&#8217;m making it official. I registered socialphotography.org and created the Social Photography Facebook Group and the socialpics Twitter account.
The concept is very simple. I take portraits of people at parties, on their way to parties and generally just having fun. It&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Some of you know that in my spare time I&#8217;ve been taking pictures of people at parties.</p>
<p>Well, I&#8217;m making it official. I registered <a href="http://socialphotography.org">socialphotography.org</a> and created the <a href="http://www.facebook.com/group.php?gid=182023890667">Social Photography Facebook Group</a> and the <a href="http://twitter.com/socialpics">socialpics</a> Twitter account.</p>
<p>The concept is very simple. I take portraits of people at parties, on their way to parties and generally just having fun. It&#8217;s usually a very social experience. I really enjoy the social aspect of the pictures, that&#8217;s why for me it&#8217;s &#8211; &#8220;Social first, Photography second.&#8221;</p>
<p>See you later, guys!</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/11/18/announcing-a-new-project-called-social-photography/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Collaboration is hot again!</title>
		<link>http://taras.cc/index.php/2009/11/04/collaboration-is-hot-again/</link>
		<comments>http://taras.cc/index.php/2009/11/04/collaboration-is-hot-again/#comments</comments>
		<pubDate>Wed, 04 Nov 2009 06:37:28 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Collaboration]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=177</guid>
		<description><![CDATA[I was at CASCON 2009 today and collaboration seems to be a hot topic again. Within the last 24 hours, I met 2 new people who are interested in collaboration platforms.
The first is a professor from the University of Toronto, who is starting to do research on collaboration tools and why people collaborate.
Second, a techie [...]]]></description>
			<content:encoded><![CDATA[<p>I was at CASCON 2009 today and collaboration seems to be a hot topic again. Within the last 24 hours, I met 2 new people who are interested in collaboration platforms.<span id="more-177"></span></p>
<p>The first is a professor from the University of Toronto, who is starting to do research on collaboration tools and why people collaborate.</p>
<p>The second is a techie from Vancouver who is interested in a collaboration system for a new Vancouver HUB that his group is working on.</p>
<p>I&#8217;m really excited about this development, because Josh and I have been beating the collaboration drum for 3 years now and it looks like people are starting to respond.</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/11/04/collaboration-is-hot-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3scale.net integration is complete</title>
		<link>http://taras.cc/index.php/2009/10/10/3scale-net-integration-is-complete/</link>
		<comments>http://taras.cc/index.php/2009/10/10/3scale-net-integration-is-complete/#comments</comments>
		<pubDate>Sat, 10 Oct 2009 22:46:26 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Status Update]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=175</guid>
		<description><![CDATA[I&#8217;ve now integrated 3scale.net into our Datastore API.
Now I can control and meter access to the Datastore API using 3scale.
Next, I&#8217;m moving on to RabbitMQ and Celery. The goal is to be able to create tasks that are executed by offline worker servers.
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve now integrated 3scale.net into our Datastore API.</p>
<p>Now I can control and meter access to the Datastore API using 3scale.</p>
<p>Next, I&#8217;m moving on to RabbitMQ and Celery. The goal is to be able to create tasks that are executed by offline worker servers.</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/10/10/3scale-net-integration-is-complete/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing REST API with restclient</title>
		<link>http://taras.cc/index.php/2009/10/07/testing-rest-api-with-restclient/</link>
		<comments>http://taras.cc/index.php/2009/10/07/testing-rest-api-with-restclient/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 20:58:34 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[API]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=166</guid>
		<description><![CDATA[Here are some code snippets from the test that I wrote for testing the REST API for one of our projects.
I am using appengine-rest-server as API provider on the server.
Do not copy these examples verbatim, I&#8217;m only showing them as examples.
Each of these methods is defined inside a class.
Imports

from nose.tools import assert_equals, assert_true, assert_false
from [...]]]></description>
			<content:encoded><![CDATA[<p>Here are some code snippets from the test that I wrote for testing the REST API for one of our projects.<span id="more-166"></span><br />
I am using <a href="http://code.google.com/p/appengine-rest-server/">appengine-rest-server</a> as API provider on the server.</p>
<p>Do not copy these examples verbatim, I&#8217;m only showing them as examples.<br />
Each of these methods is defined inside a class.</p>
<h4>Imports</h4>
<pre><code>
from nose.tools import assert_equals, assert_true, assert_false
from restclient import GET, POST, PUT, DELETE
from lxml import etree, objectify
from StringIO import StringIO
</code></pre>
<h4>Verifying XML against XML Schema</h4>
<pre><code>
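    def verify_against_schema(self, xml, schema):
        '''
        Check that the data is valid for the provided schema.

        xml is the XML document to verify, as a string
        schema is the XML Schema document itself, as a string
        (the test below fetches it from the API with GET)
        '''
        # Compile the schema, then parse with a validating parser;
        # lxml raises etree.XMLSyntaxError if the document is invalid.
        schema = etree.XMLSchema(etree.parse(StringIO(schema)))
        parser = etree.XMLParser(schema=schema)
        return etree.parse(StringIO(xml), parser=parser)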

    def test_review_schema(self):
        'Vacation :: Verifying review schema'

        xml = '''<?xml version="1.0" encoding="utf-8"?>
        <Review>
            <service>4</service>
            <cleanliness>4</cleanliness>
            <reason>Wedding</reason>
            <value>4</value>
            <resort>34r3423f3rf32f3f</resort>
            <ta_id>334323443</ta_id>
            <rooms>300</rooms>
            <recommend>true</recommend>
            <date>2009-10-07</date>
            <type>Family</type>
            <location>4</location>
        </Review>
        '''

        verified = self.verify_against_schema(xml, GET(self.url('/api/rest/metadata/Review')))
        assert_true(verified)

</code></pre>
<h4>Creating an item</h4>
<pre><code>

    def create_review(self, resort, full=False):
        'Create review for testing purposes'

        xml = '''<?xml version="1.0" encoding="utf-8"?>
        <Review>
            <service>4</service>
            <cleanliness>4</cleanliness>
            <reason>Wedding</reason>
            <value>4</value>
            <resort>%s</resort>
            <ta_id>334323443</ta_id>
            <rooms>300</rooms>
            <recommend>1</recommend>
            <date>2009-10-07</date>
            <type>Family</type>
            <location>4</location>
        </Review>
        '''%resort

        if full:
            url = '/api/rest/Review?type=full'
        else:
            url = '/api/rest/Review'

        return POST(self.url(url), body=xml, async=False, resp=True)
</code></pre>
<h4>Updating an item</h4>
<pre><code>
    def test_updating_review(self):
        'Vacation :: Updating review'

        response, resort = self.create_resort()
        response, review_id = self.create_review(resort)

        xml = '''<?xml version="1.0" encoding="utf-8"?>
        <Review>
            <service>5</service>
            <cleanliness>5</cleanliness>
            <reason>Wedding</reason>
            <value>5</value>
            <resort>%s</resort>
            <ta_id>334323443</ta_id>
            <rooms>520</rooms>
            <recommend>true</recommend>
            <date>2009-10-07</date>
            <type>Family</type>
            <location>4</location>
        </Review>
        '''%resort
        response, content = PUT(self.url('/api/rest/Review/%s?type=full'%review_id), body=xml, resp=True, async=False )

        expected = '''<?xml version="1.0" encoding="utf-8"?>
        <Review>
            <key>%s</key>
            <service>5</service>
            <cleanliness>5</cleanliness>
            <reason>Wedding</reason>
            <value>5</value>
            <resort>%s</resort>
            <ta_id>334323443</ta_id>
            <rooms>520</rooms>
            <recommend>true</recommend>
            <date>2009-10-07</date>
            <type>Family</type>
            <location>4</location>
        </Review>
        '''%(review_id,resort)

        expected = objectify.fromstring(expected)
        expected = etree.tostring(expected)

        result = objectify.fromstring(content)
        result = etree.tostring(result)

        assert_equals(expected, result)
</code></pre>
<p>I hope you find these helpful.</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/10/07/testing-rest-api-with-restclient/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>API Testing Code is ready</title>
		<link>http://taras.cc/index.php/2009/10/07/api-testing-code-is-ready/</link>
		<comments>http://taras.cc/index.php/2009/10/07/api-testing-code-is-ready/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 20:46:32 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Status Update]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=164</guid>
		<description><![CDATA[Finally, I have working tests for the Vacation API. There were several things that I needed to overcome to accomplish this.

Add ability to POST body to restclient &#8211; created patch and contributed it to the project
Write a bunch of tests

Done, now I can move on to integrating 3scale.net into the API Server. 
]]></description>
			<content:encoded><![CDATA[<p>Finally, I have working tests for the Vacation API. There were several things that I needed to overcome to accomplish this.</p>
<ol>
<li>Add ability to POST body to restclient &#8211; <a href="http://code.google.com/p/microapps/issues/detail?id=5#c2">created patch and contributed it to the project</a></li>
<li>Write a bunch of tests</li>
</ol>
<p>Done, now I can move on to integrating <a href="http://3scale.net/">3scale.net</a> into the API Server. </p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/10/07/api-testing-code-is-ready/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>September 30th, 2009 -&gt; October 14th, 2009</title>
		<link>http://taras.cc/index.php/2009/09/30/september-30th-2009-october-14th-2009/</link>
		<comments>http://taras.cc/index.php/2009/09/30/september-30th-2009-october-14th-2009/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 20:48:04 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[How I'll roll]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=162</guid>
		<description><![CDATA[The next 2 weeks are dedicated to finishing the development of the Data Store on the Google App Engine. I want to have a solid testing framework set up for the Data Store REST API before I integrate 3scale. This will allow me to ensure that the API is working perfectly without having to manually test [...]]]></description>
			<content:encoded><![CDATA[<p>The next 2 weeks are dedicated to finishing the development of the Data Store on the Google App Engine. I want to have a solid testing framework set up for the Data Store REST API before I integrate 3scale. This will allow me to ensure that the API is working perfectly without having to manually test everything. This will also make it possible to do regression testing.</p>
<p>Last week, I got distracted trying to eliminate errors that dev_appserver.py was generating because it was trying to include packages from site-packages that it did not require for the project. I attempted to resolve this issue by creating a buildout for GAE projects that I could use in combination with a virtualenv --no-site-packages configuration. This did not work because GAE was attempting to load packages from the system&#8217;s Python instead of looking in the virtualenv directory. I&#8217;m not going to bother trying to resolve this anymore. I&#8217;ll just have to live with the errors.</p>
<p>Ok, on I go.</p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/09/30/september-30th-2009-october-14th-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Architecture of our Infrastructure</title>
		<link>http://taras.cc/index.php/2009/09/30/architecture-of-our-infrastructure/</link>
		<comments>http://taras.cc/index.php/2009/09/30/architecture-of-our-infrastructure/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 20:29:16 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Beecoop]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=157</guid>
		<description><![CDATA[My main goal is to create a system that will scale easily and allow us to use commodity infrastructure elements. These commodity infrastructure elements each have their advantages and disadvantages. The architecture of the system aims to utilize the advantages and minimize the disadvantages. Here is a list of the different functions performed by our system.

Public User Interface
Collection
Processing
Storing

Public [...]]]></description>
			<content:encoded><![CDATA[<p>My main goal is to create a system that will scale easily and allow us to use commodity infrastructure elements. These commodity infrastructure elements each have their advantages and disadvantages. The architecture of the system aims to utilize the advantages and minimize the disadvantages.<span id="more-157"></span> Here is a list of the different functions performed by our system.</p>
<ul>
<li>Public User Interface</li>
<li>Collection</li>
<li>Processing</li>
<li>Storing</li>
</ul>
<h4>Public User Interface</h4>
<p>Running on <a href="http://www.cherokee-project.com/">Cherokee Web Server</a>. Cherokee is a high-efficiency web server that has low requirements and can handle very high loads. This element can be served sufficiently well by an inexpensive virtualized web server running in the cloud.</p>
<h4>Collection</h4>
<p>The Collection element of the system will be performed by <a href="http://scrapy.org/">Scrapy</a> spiders, which will get their tasks from the Task Server and store the results in the Beecoop Data Store.</p>
<h4>Processing</h4>
<p>Processing is done by workers that, like Collection, get their tasks from the Task Server and store their data in either the Beecoop Data Store or the PUI database.</p>
<p>Both Processing and Collection require a lot of RAM and CPU. We&#8217;re going to attempt to use commodity hardware to fulfill these functions. Core Networks has inexpensive hardware that we can try to use. If this fails then we can use Amazon EC2.</p>
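<p>A rough sketch of the worker loop that Collection and Processing share, using Python&#8217;s standard queue module as a stand-in for the Task Server and the Data Store. The names <i>worker</i> and <i>measure</i> and the example URLs are made up for illustration. A real deployment would pull tasks from a message broker rather than an in-process queue.</p>
<pre><code>
import queue
import threading


def worker(tasks, results, handler):
    """Take tasks until a None sentinel arrives, storing each result."""
    while True:
        task = tasks.get()
        if task is None:
            break
        results.put(handler(task))


def measure(url):
    # Placeholder for real collection or processing work.
    return (url, len(url))


tasks = queue.Queue()    # stands in for the Task Server
results = queue.Queue()  # stands in for the Data Store

threads = [threading.Thread(target=worker, args=(tasks, results, measure))
           for _ in range(2)]
for thread in threads:
    thread.start()
for url in ["http://example.com/a", "http://example.com/bb"]:
    tasks.put(url)
for _ in threads:
    tasks.put(None)  # one sentinel per worker shuts the pool down
for thread in threads:
    thread.join()

stored = sorted(results.get() for _ in range(2))
print(stored)
</code></pre>
<p>Because workers only ever talk to the two queues, adding capacity means starting more workers on more machines without changing the rest of the system.</p>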
<h4>Storing</h4>
<p>The Collection element will capture a lot of different kinds of data. In our testing, over 1 week of execution, we collected over 500,000 units of data. To handle this volume of information we&#8217;re going to utilize Google Data Store, because it will give us a huge database that we will not have to scale ourselves.</p>
<h3>Visual Representation of our Infrastructure</h3>
<p><img src="http://taras.cc/wp-content/uploads/2009/09/architecture1.jpg" alt="architecture.jpg" border="0" width="693" height="512" /></p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/09/30/architecture-of-our-infrastructure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pocket Retro Game Emulator</title>
		<link>http://taras.cc/index.php/2009/09/30/pocket-retro-game-emulator/</link>
		<comments>http://taras.cc/index.php/2009/09/30/pocket-retro-game-emulator/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 15:30:58 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=132</guid>
		<description><![CDATA[I want one of these:



]]></description>
			<content:encoded><![CDATA[<p>I want one of these:<br />
<a href="http://www.thinkgeek.com/electronics/retro-gaming/bd6f/?cpg=cj"><br />
<img src="http://taras.cc/wp-content/uploads/2009/09/bd6f_pocket_retro_game_emulator.jpg" alt="bd6f_pocket_retro_game_emulator.jpg" border="0" width="400" height="309" /><br />
</a></p>
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/09/30/pocket-retro-game-emulator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building python development environment</title>
		<link>http://taras.cc/index.php/2009/09/30/building-python-development-environment/</link>
		<comments>http://taras.cc/index.php/2009/09/30/building-python-development-environment/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 15:30:52 +0000</pubDate>
		<dc:creator>Taras</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Beecoop]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[open]]></category>

		<guid isPermaLink="false">http://taras.cc/?p=130</guid>
		<description><![CDATA[I&#8217;m preparing to start development on beecoop and anonymous websites and I would like to establish a comfortable Python development environment to make development and deployment of different applications easier.
There are several factors that I need to consider:

I&#8217;m going to be working on several different projects at the same time.
Each project is going to be [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m preparing to start development on beecoop and anonymous websites and I would like to establish a comfortable Python development environment to make development and deployment of different applications easier.</p>
<p>There are several factors that I need to consider:</p>
<ul>
<li>I&#8217;m going to be working on several different projects at the same time.</li>
<li>Each project is going to be using different Python modules.</li>
<li>Some projects will run on Apache, others on Cherokee.</li>
<li>Cherokee does not have an installer for Mac OS X, so I cannot replicate our production environment locally.</li>
<li>I would like to start practicing TDD.</li>
<li>Last but not least, I have to make it easy for our sysadmin to deploy these projects.</li>
</ul>
<p><span id="more-130"></span>Here are the technologies that I&#8217;ve found so far and how I would like to combine them.</p>
<p><a href="http://pypi.python.org/pypi/virtualenv">Virtualenv</a> with the <i>--no-site-packages</i> switch allows you to create virtual environments that are separate from your system. This forces you to be conscious of what packages your application is using. This is particularly useful when you&#8217;re developing applications that require different versions of Python. We&#8217;re going to stick to Python 2.5 for the near future, so this is not a major concern for us.</p>
<p><a href="http://pypi.python.org/pypi/zc.buildout">Buildout</a> is the tool that I think we&#8217;re actually going to be using. It makes it easy for the sysadmin to deploy the applications, while maintaining the benefit of forcing us to be conscious of what packages a particular application requires. Buildout makes virtualenv unnecessary for us because it simulates virtualenv&#8217;s functionality by including the paths for all necessary packages in sys.path of every project.</p>
<p><a href="http://pypi.python.org/pypi/PasteScript">Paste Script</a> lets us create packages that we can include inside of our applications. It makes it easy to integrate multiple modules that are under development into our applications while keeping things clean and modular.</p>
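<p>To illustrate the sys.path approach (this is not buildout&#8217;s actual code, only a sketch of the idea): a script can prepend the exact package paths a project needs before importing anything. The <i>demo_pkg</i> package and its directory below are hypothetical and are created on the fly so the example is self-contained.</p>
<pre><code>
import os
import sys
import tempfile

# Hypothetical egg directory holding one package, standing in for the
# packages that would be installed for a single project.
eggs = tempfile.mkdtemp()
package_dir = os.path.join(eggs, "demo_pkg-1.0.egg")
os.makedirs(os.path.join(package_dir, "demo_pkg"))
init_path = os.path.join(package_dir, "demo_pkg", "__init__.py")
with open(init_path, "w") as handle:
    handle.write("VERSION = '1.0'\n")

# Prepend only the paths this project needs, ahead of site-packages.
sys.path[0:0] = [package_dir]

import demo_pkg

print(demo_pkg.VERSION)
</code></pre>
<p>Since each project lists its own paths, two projects on the same machine can use different versions of the same package without a separate virtual environment.</p>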
]]></content:encoded>
			<wfw:commentRss>http://taras.cc/index.php/2009/09/30/building-python-development-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
