These days I'm really busy working at FlxOne and developing the new CloudPelican project. It solves problems related to decentralized, inaccessible data: CloudPelican makes that data searchable in real time through a good-looking interface.
Today is the day: my first external blog post was published. It explains why online marketers should leverage all the data available. Size does matter! I think you should read it if you are interested in big data, marketing, or the combination of the two.
How to: automated transition from Amazon S3 to Glacier
Amazon recently launched the Glacier service, a tape-robot-like system that is even cheaper for archiving your data. It has pros and cons, but all together it's really simple: if you do not expect to need access to the data again but want to keep it for archiving, put it into Glacier. You can get it back from Glacier, but retrieval is slow (it takes hours) and expensive.
So, my data is in Amazon S3; how do I automatically move it to Glacier? Just follow me!
1. Log in to your Amazon Web Services Console
2. Go to the S3 service page (the one that lists your buckets)
3. Now select your bucket and click the "Properties" button in the top left
4. Unfold the "Lifecycle" tab
5. Now you can add a new rule; click the button
6. Click the "Add transition" button and configure it as preferred. Important fields are the period and the bucket selection (e.g. the entire bucket, or part of it by prefix). Warning: make sure you don't pick the expiration option, as this will delete your data!
7. Now you're good to go, your data will be moved to the Glacier layer after 31 days.
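The same rule can also be set up from code instead of the console. Below is a minimal sketch in Python, assuming the boto3 library is installed and AWS credentials are configured. The function names, the rule ID and the `logs/` prefix are made up for illustration. Note that, as far as I know, `put_bucket_lifecycle_configuration` replaces the whole lifecycle configuration of the bucket, so any existing rules need to be included too.

```python
def glacier_transition_rule(days=31, prefix="", rule_id="archive-to-glacier"):
    """Build one lifecycle rule that moves objects to Glacier after `days` days.

    An empty prefix applies the rule to the entire bucket. The rule
    deliberately has no "Expiration" entry, so nothing gets deleted.
    The rule_id is a hypothetical name, pick your own.
    """
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


def apply_rule(bucket, rule):
    """Send the rule to S3 (overwrites the bucket's existing lifecycle rules)."""
    import boto3  # imported here so the rule builder works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [rule]},
    )
```

Usage would be something like `apply_rule("my-bucket", glacier_transition_rule(31, "logs/"))`, where `my-bucket` is a placeholder for your own bucket name.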
Recently I noticed that our increasingly busy Hive server started to fail more and more often (a couple of times a month). This caused issues in other parts of the system. To resolve some basic problems (dead process, not listening on the correct port) I wrote a little script that can run every minute (from the crontab).
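To give an idea of what such a check can look like, here is a minimal sketch in Python. It is not the original script: the function names are made up, and the host, port and restart command are assumptions you would replace with the values of your own setup. It only covers the "not listening on the correct port" case by trying to open a TCP connection, and runs a restart command when that fails.

```python
import socket
import subprocess


def port_is_listening(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


def check_and_restart(host, port, restart_cmd):
    """Run restart_cmd when the port is not listening.

    restart_cmd is a list of arguments (hypothetical, depends on how
    the server is started on your machine). Returns True if a restart
    was attempted, False if the server looked healthy.
    """
    if port_is_listening(host, port):
        return False
    subprocess.run(restart_cmd, check=False)
    return True
```

A small wrapper script that calls `check_and_restart` with your own host, port and restart command can then be scheduled from the crontab with a `* * * * *` entry, so it runs every minute.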
In the past 6 months I have been active as a Software Engineer at FlxOne. I really got a lot of opportunities to extend my knowledge.
As of the 27th of November I carry the job title Data Architect. This is due to my daily activities working with software and data.
The software I developed was all about data, with an ever stronger focus on data architecture: things like setting up a data warehouse (currently 96TB of storage, a bunch of processors, even more RAM), lots and lots of ETL, and all the other data-architecture-related things.
So that's about it. Me being a Data Architect with a passion for developing the fastest, best performing software.
Would you like to become my right hand as Online Zookeeper or are you interested in the whole RTB thing? We still have some jobs open. Even one for a FlxWizard. Becoming a FlxWizard is like mixing marketing, technology and complex problems in one single job. Nice one though!