Posts Tagged Under: scrapy

Installing Scrapy on Amazon Linux

I’ve recently moved all my AWS instances over to Amazon Linux and wanted to write a short update on installing Scrapy, as the process is slightly different from Ubuntu.

Why Amazon Linux?

Amazon Linux is a distribution that evolved from Red Hat Enterprise Linux (RHEL) and CentOS. It is available for use

Read More

Scrapy and DynamoDB on AWS

Amazon DynamoDB is a fully managed proprietary NoSQL database service that is offered by Amazon.com as part of the Amazon Web Services portfolio.

If you’ve considered using MongoDB for storing your scraped results and, like me, you’re doing your scraping from the cloud anyway, then why not make use of

Read More

Installing scrapy and scrapyd on AWS EC2

See the updated version for installing scrapy 1.0 and above here.

This post will cover the basics of getting started with AWS: creating an account, creating an EC2 instance, installing scrapy and scrapyd, and finally making sure you do it all for free!

Getting Started

Keeping It Free!

Before you go any

Read More

ImportError: No module named settings

So, after scratching my head for a while as to why my shiny new spider wasn’t doing what it was supposed to do, I finally found this issue, the fix for which hasn’t been merged in yet.

Simple fix: do not call your scrapy project test!
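
A likely explanation, and this is an assumption rather than something the linked issue is quoted as saying here, is that a project named test clashes with a module of the same name that Python can already import, so the lookup for the project's settings lands in the wrong package. The snippet below is a minimal sketch of a check you could run before picking a project name. The helper name_is_taken and the candidate name my_scraper are made up for illustration and are not part of scrapy.

```python
import importlib.util


def name_is_taken(project_name: str) -> bool:
    """Return True if a top-level module with this name is already importable.

    Hypothetical helper for illustration; not part of scrapy.
    """
    try:
        return importlib.util.find_spec(project_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for unusual or malformed names;
        # treat those as unsafe choices as well.
        return True


# "my_scraper" is just an example candidate name.
for candidate in ("test", "my_scraper"):
    if name_is_taken(candidate):
        print(f"{candidate}: already importable, pick another name")
    else:
        print(f"{candidate}: looks free")
```

Anything the check reports as already importable is worth avoiding as a project name, since Python may resolve imports to the existing module instead of your project.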

Read More

Deploying scrapy on EC2

Welcome to part 3 of my guide to using AWS for scraping.

If you haven’t already, make sure you check out the first two parts, here and here. We’re going to continue using the same EC2 instance you created in part two.

Some assumptions before we begin

I’m going to assume a

Read More