samizdat-devel

import_feeds-0.4 (relative to 20070506)


From: boud
Subject: import_feeds-0.4 (relative to 20070506)
Date: Mon, 7 May 2007 01:26:42 +0200 (CEST)

hi samizdat-devel,

The old patches pre-20070501 clearly won't work any more.

In this message and the next I'm sending patches for
import_feeds-0.4 and calendar-0.3 that should work with the 20070506
snapshot (they do for me). There are some minor improvements relative
to the previous patches distributed on the samizdat-devel list, but
they are still very much
hacks-that-should-work-until-something-better-comes-along.


import_feeds-0.4 (cf http://lists.gnu.org/archive/html/samizdat-devel/2006-10/msg00004.html )

* now gives (and caches) feeds in preferred language (e.g. @request.language)

* uses div class="title", class="info" for css styling

* The first improvement on import_feeds - importing in the user's
preferred language rather than the site default - is probably quite
important for usability (though it was trivial to add), especially for
a site like www.indymedia.org. Once import_feeds is working really
well, we should be able to do a better job of the www.indy right-hand
column in samizdat, e.g. on samizdat.axxs.org.

* The second improvement makes the titles more pleasant to click on,
since they are now homogenised with the titles of local articles through
css styles (though the font remains smaller).
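
To make the per-language caching concrete, here is a minimal standalone
sketch (not part of the patch) of how a cache key can combine the
language with a time bucket. The name feed_cache_key and the default
interval of 3600 seconds are only illustrative; the helper in the patch
does the same arithmetic inline.

```ruby
# Sketch only: feed_cache_key is a hypothetical name, not part of Samizdat.
# The current time is rounded down to the start of its caching interval, so
# all requests in the same interval and language share one cache entry.
def feed_cache_key(lang, now = Time.now, interval = 3600)
  bucket_start = (now.to_i / interval) * interval
  "imported_feeds/#{lang}/#{bucket_start}"
end

puts feed_cache_key('en', Time.at(7300))   # two requests in the same hour ...
puts feed_cache_key('en', Time.at(10799))  # ... map to the same key
puts feed_cache_key('pl', Time.at(7300))   # another language gets its own entry
```

Because the bucket start is part of the key, an entry stops being used
once the interval rolls over, without any explicit expiry logic.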

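For the markup side, this is a rough sketch of one list item using the
"title" and "info" classes. The helper name feed_item_html is made up
for illustration, and escaping the feed-supplied strings with
CGI.escapeHTML is my assumption here, not something the patch does.

```ruby
require 'cgi'

# Sketch only: feed_item_html is a hypothetical helper, not part of Samizdat.
# Renders one imported item with the "title" and "info" div classes, escaping
# the feed-supplied strings and closing the tags in nesting order.
def feed_item_html(link, title, source, date)
  %{<li><div class="title"><a href="#{CGI.escapeHTML(link)}">} +
    %{#{CGI.escapeHTML(title)}</a></div>} +
    %{<div class="info">#{CGI.escapeHTML(source)}, #{CGI.escapeHTML(date)}</div></li>}
end

puts feed_item_html('http://example.org/a?x=1&y=2', 'News & views',
                    'example', '2007-05-06 12:00')
```
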

cheers
boud




--- /tmp/tmp_snapshot/samizdat/lib/samizdat/controllers/frontpage_controller.rb 2007-05-05 14:15:07.000000000 +0200
+++ /usr/lib/ruby/1.8/samizdat/controllers/frontpage_controller.rb      2007-05-07 01:01:14.127633840 +0200
@@ -8,6 +8,8 @@
 #
 # vim: et sw=2 sts=2 ts=8 tw=0

+require 'samizdat/helpers/import_feeds_helper'
+
 class FrontpageController < Controller

   def index
@@ -70,6 +72,14 @@
         nav_rss(rss_updates) << nav(updates.size))
     end

+    imported_feeds = ""   # default is zero-length string
+    lang_i_f = @request.language || config['locale']['languages'][0] || 'en'
+    if( config['import_feeds'] )
+      imported_feeds = %{<tr><td class="links-head">}+ _('RDF Feeds')+
+        '</td></tr>
+         <tr><td class="links">' + import_feeds_method(lang_i_f) + '</td></tr>'
+    end
+
     page =
       if full_front_page
 %{<table>
@@ -78,10 +88,10 @@
   </thead>
   <tr>
     <td class="focuses">#{focuses}</td>
-    <td class="features" rowspan="3">#{features}</td>
-    <td class="updates" rowspan="3">#{updates}</td>
-  </tr>
-  <tr><td class="links-head">}+_('Links')+'</td></tr>
+    <td class="features" rowspan="6">#{features}</td>
+    <td class="updates" rowspan="6">#{updates}</td>
+  </tr>} + imported_feeds +
+  %{<tr><td class="links-head">}+_('Links')+'</td></tr>
   <tr><td class="links">
     <div class="focus"><a href="query/run?q='+CGI.escape('SELECT ?resource WHERE (dc::date ?resource ?date) (s::inReplyTo ?resource ?parent) LITERAL ?parent IS NOT NULL ORDER BY ?date DESC')+'">'+_('All Replies')+'</a></div>
     <div class="focus"><a href="moderation">'+_('Moderation Log')+'</a></div>


--- /dev/null   2005-09-15 04:53:34.000000000 +0200
+++ /usr/lib/ruby/1.8/samizdat/helpers/import_feeds_helper.rb   2007-05-07 01:05:37.890535784 +0200
@@ -0,0 +1,181 @@
+#!/usr/bin/env ruby
+#
+# Samizdat import_feeds helper
+#
+#   Copyright (c) 2002-2006  Dmitry Borodaenko <address@hidden>,
+#   Boud (Indymedia) <address@hidden>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 2 or later.
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+# VERSION import_feeds 0.4
+
+require 'samizdat/engine'
+
+require 'open-uri'
+require 'rss/1.0'
+require 'rss/dublincore'
+require 'rss/2.0'
+
+
+# TODO: The format_date method is from template.rb. In principle,
+# imported feeds should (could) be treated as resources - somewhat
+# similar to messages, but with some properties distinct from ordinary
+# messages. In that case, there would be no need to have redundancy
+# for the format_date method.
+def format_date(date)
+  date = date.to_time if date.respond_to? :to_time   # duck
+  date = date.strftime '%Y-%m-%d %H:%M' if date.kind_of? Time
+  date
+end
+
+
+def import_feeds_method(lang='en')
+
+  import_feeds_body = "<ul>"
+
+  interval = config['timeout']['import_feeds'] # time interval for importing
+  interval = 3600 if (interval == nil)  # failsafe default
+  timenow = Time.now  # object of Time class
+
+  # The expected caching time is the last "round number" time interval,
+  # based on total time in seconds defined in the Time class.
+  expected_caching_time = timenow.to_i.divmod(interval)[0] * interval
+  import_feeds_cache_key = 'imported_feeds/' + lang + '/' +
+    expected_caching_time.to_s
+
+  import_feeds_list_array  = cache[import_feeds_cache_key]
+
+  if(import_feeds_list_array == nil)
+
+    import_feeds_list = Hash.new
+
+    config['import_feeds'].each do | feed_key, feed_value |
+      rss_source = feed_key
+
+      # At some point in the future, people might want to have e.g. https
+      # feeds, but there is no need to force people to write http:// when
+      # this is a very widely used default value. So protocol is optional
+      # here.
+
+      protocol = feed_value['protocol']
+      protocol = "http://" if( protocol == nil)
+
+      host = feed_value['host']
+      host = _(' Hostname missing.') if (host == nil)
+      filename = feed_value['filename']
+      filename = _(' Filename missing.') if (filename == nil)
+      anURI = protocol + host + filename
+      #    anURI = protocol + feed_value['host'] + feed_value['filename']
+
+      # TODO: security - check before untainting?
+      # TODO: store and prepare rdf feeds in all available languages
+      # and give the user the one s/he wants?
+      # 20070504: DONE - but only prepare and cache in requested language
+      response= ""
+      valid_URI=0
+      begin
+        open(anURI.untaint,
+#             "Accept-Language" => config['locale']['languages'][0]) do |file|
+             "Accept-Language" => lang) do |file|
+          response += file.read
+          valid_URI=1
+        end
+      rescue SocketError
+        valid_URI=0
+        import_feeds_body += _('<li><em>Error opening ') + %{<a href="} +
+         anURI + %{">} + _('this feed') + "</a></em></li>\n"
+      rescue URI::InvalidURIError
+        valid_URI=0
+        import_feeds_body += _('<li><em>Error opening ') + %{<a href="} +
+         anURI + %{">} + _('this feed') + "</a></em></li>\n"
+      rescue
+        valid_URI=0
+        import_feeds_body += _('<li><em>Error opening ') + %{<a href="} +
+         anURI + %{">} + _('this feed') + "</a></em></li>\n"
+      end
+
+      if(valid_URI==1)
+
+        # Remove tag section not needed and known to be buggy for
+        # invalid "mn" type URI  http://usefulinc.com/rss/manifest/
+       if response =~ %r{http://usefulinc.com/rss/manifest/}
+           response.sub!(/<rdf:Description(.*\n)*?.*mn:channels.*(.*\n)*?.*<\/rdf:Description>/,"")
+        end
+
+        # The parsing of the feed initially allows non-RSS-1.0 compliant
+        # feeds, but the do_validate method is used on individual items
+        # later on to check their validity.
+        begin
+          rss = RSS::Parser.parse(response)  # for RSS 1.0 compliant feeds
+        rescue RSS::InvalidRSSError
+          rss = RSS::Parser.parse(response, false) # allow non RSS 1.0 compliant
+        end
+
+        if(rss)
+          # rss.channel in RSS 2.0 seems to contain info in "rss" for RSS 1.0
+          # So rss_channel is used here as a common name for either.
+          rss_channel = rss
+          if rss.rss_version == "2.0"
+            rss_channel = rss.channel
+          end
+
+          # if there is a 'max_entries' parameter, then use at most that
+          # number of items for that feed
+          n_items=rss_channel.items.length
+          if(feed_value['max_entries'])
+            if(n_items > feed_value['max_entries'])
+              n_items = feed_value['max_entries']
+            end
+          end
+
+          for item_number in 0...n_items
+            if rss_channel.item(item_number).do_validate
+              rss_link = rss_channel.item(item_number).link.strip
+              title = rss_channel.item(item_number).title.strip
+              date = format_date(rss_channel.item(item_number).date)
+
+              # add this feed to the list of valid feeds
+              import_feeds_list[rss_link] = { "rss_source" => rss_source,
+                "title" => title, "date" => date }
+
+            end
+          end  #     for item_number in 0...n_items
+        end  #    if(rss)
+      end #  if(valid_URI==1)
+    end # config['import_feeds'].each do | feed_key, feed_value |
+
+
+
+
+    # Sort the import feeds list by date. The result is an array of
+    # pairs.  The first element of each pair is the link (in principle,
+    # this should be unique). The second element of each pair is
+    # a hash, containing the other useful pieces of feed
+    # information (such as source, title, date)
+    import_feeds_list_array = import_feeds_list.sort {
+      |a,b| b[1]['date'] <=> a[1]['date'] }
+
+    # update the cache
+    cache[import_feeds_cache_key] = import_feeds_list_array
+
+  end #    if(import_feeds_list_array == nil)
+
+  import_feeds_list_array.each do | feed |
+    import_feeds_body +=
+      %{<li> <div class="title"> <a title="} +
+        _('Click to view this external resource') +
+        %{" href="} + feed[0] + %{">} + feed[1]['title'] + "</a> </div>\n" +
+        %{<div class="info">} + feed[1]['rss_source'] + ", " +
+        feed[1]['date'] + "</div></li>\n"
+  end
+
+  import_feeds_body +=  "</ul>"
+
+  import_feeds_body
+
+end # def import_feeds_method
+





