annotate FeedUpdater.py @ 9:fd4c8bfa62d6
FeedUpdater throws an exception if the URL could not be retrieved successfully. Includes unit tests.
author | Dirk Olmes <dirk@xanthippe.ping.de> |
---|---|
date | Tue, 27 Apr 2010 10:22:35 +0200 |
parents | 215c34f61e95 |
children | 01a86b178e60 |
rev | changeset | description | parents | lines last changed by this changeset
---|---|---|---|---
4 | e0199f383442 | retrieve a feed for the given URL, store entries as feed_entry rows into the database | | everything not listed under 5 or 9
5 | bfd47f55d85b | add the updated date of the feed | 4 | the `datetime` import and the body of `createFeedEntry`
9 | fd4c8bfa62d6 | FeedUpdater throws an exception if the URL could not be retrieved successfully. Includes unit tests. | 7 | `STATUS_OK`, `getFeed` and the call to it in `update`, `FeedUpdateException`

All three changesets are by Dirk Olmes <dirk@xanthippe.ping.de>.

Line source:

    from datetime import datetime
    from Feed import Feed
    from FeedEntry import FeedEntry
    import feedparser

    # HTTP status code of a successful retrieval
    STATUS_OK = 200

    def updateAllFeeds(session):
        allFeeds = session.query(Feed)
        for feed in allFeeds:
            FeedUpdater(session, feed).update()
        session.commit()

    class FeedUpdater(object):
        def __init__(self, session, feed):
            self.session = session
            self.feed = feed

        def update(self):
            result = self.getFeed()
            for entry in result.entries:
                self.processEntry(entry)

        def getFeed(self):
            result = feedparser.parse(self.feed.rss_url)
            # Anything other than HTTP 200 counts as a failed retrieval
            if result["status"] != STATUS_OK:
                raise FeedUpdateException()
            return result

        def processEntry(self, entry):
            # Store only entries that are not in the database yet
            feedEntry = FeedEntry.findById(entry.id, self.session)
            if feedEntry is None:
                self.createFeedEntry(entry)

        def createFeedEntry(self, entry):
            new = FeedEntry()
            new.id = entry.id
            new.link = entry.link
            new.title = entry.title
            # updated_parsed is a time tuple; its first six fields are year through second
            new.updated = datetime(*entry.updated_parsed[:6])
            new.summary = entry.summary
            new.feed = self.feed
            self.session.add(new)

    class FeedUpdateException(Exception):
        pass
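
The status check in `getFeed` and the date conversion in `createFeedEntry` can be tried out in isolation. The following is a minimal sketch, not code from the repository: `check_status` and `to_datetime` are hypothetical helper names, and plain dictionaries and `time.strptime` stand in for a real feedparser result. It assumes that a result may lack the `status` key altogether (feedparser only sets it when the feed was fetched from a web server) and treats that case as a failed retrieval.

```python
from datetime import datetime
import time

STATUS_OK = 200


class FeedUpdateException(Exception):
    """Raised when a feed could not be retrieved successfully."""


def check_status(result):
    """Return `result` if it reports HTTP 200, else raise FeedUpdateException.

    `result` is any mapping shaped like a feedparser result (hypothetical
    stand-in). A missing "status" key is treated as a failure here instead
    of surfacing as a KeyError.
    """
    status = result.get("status")
    # Compare integers by value; an identity test ("is") only appears to
    # work because the interpreter happens to cache small ints.
    if status != STATUS_OK:
        raise FeedUpdateException("unexpected status: %r" % (status,))
    return result


def to_datetime(parsed_time):
    """Convert a 9-field time tuple (like entry.updated_parsed) to a datetime."""
    # The first six fields are year, month, day, hour, minute, second.
    return datetime(*parsed_time[:6])


if __name__ == "__main__":
    parsed = time.strptime("2010-04-27 10:22:35", "%Y-%m-%d %H:%M:%S")
    print(to_datetime(parsed))  # prints 2010-04-27 10:22:35
```

Keeping both helpers free of network and database access means they can be covered by unit tests without fetching a real feed.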