I’ve recently moved all my AWS instances over to Amazon Linux and wanted to write a short update on installing Scrapy, as the process is slightly different from Ubuntu.
Amazon Linux is a distribution that evolved from Red Hat Enterprise Linux (RHEL) and CentOS. It is available for use
The idea that the data flowing across a communication service provider’s (CSP) network could hold intrinsic value is nothing new. The principle of infonomics, that information should be accounted for with the same formality as traditional assets, was first proposed in the late 90s by Doug Laney, vice president and
Previously I walked through running Spark locally for development, but one of the major challenges of learning to use distributed systems is understanding how the various components are installed and interact with each other in a production-like environment.
You can use Vagrant or virtual machine images to run a cluster
As I prepared for the Developer Certification for Apache Spark by Databricks and O’Reilly, I noticed that there weren’t many resources around, so I thought I’d collect and share the ones I used to prepare for the exam. Hope it helps!
I do a lot of different development
Amazon DynamoDB is a fully managed proprietary NoSQL database service that is offered by Amazon.com as part of the Amazon Web Services portfolio.
If you’ve considered using MongoDB for storing your scraped results and, like me, you’re doing your scraping from the cloud anyway, then why not make use of
Encryption. It’s a volatile subject at the moment, especially in the wake of last Friday’s horrific attacks in Paris. The UK government has been campaigning against unbreakable encryption in the run-up to the publication of the Investigatory Powers Bill, which was released last week, and there are new calls