Complete EPG data updated in less than 10 seconds

SquonkSC

Portal Member
July 24, 2006
21
0
Hi All,

I am new to this forum, and I don't know whether what I am about to write is already known to everybody. If that's the case, I apologise.

But!

If not, I'm glad I could help.

Downloading xmltv data can take a long time, especially when you use the parameter --slow.

xmltv.exe does not just download the info; it also has to sort and clean the data.
This puts strain not only on your CPU and internet bandwidth,
but also on the websites that provide the data.

With MediaPortal and others becoming ever more popular, these websites will come under more and more load, and they will surely take action to stop this from happening.

So I thought to myself, why are we all doing the same thing when one of us could do the sorting and cleaning and the rest of us could just download the sorted data?

Well, it appears I wasn't the first to come up with this idea.

On http://xmltv.assies.info/ you can find xmltv files for 3 days, already cleaned and sorted.
The only problem is how to combine these into one tvguide.xml.

Well, someone has fixed that too.

I stumbled upon this site http://members.home.nl/robertprins/

There you can download a VBS script that does the trick.
What's more, it can combine any xmltv files into one tvguide.
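
For anyone curious what such a merge involves, here is a minimal sketch in Python. It is not the VBS script from that site, just an illustration under a few assumptions: each input is a well-formed XMLTV document with a `<tv>` root, channels are identified by their `id` attribute, and a programme counts as a duplicate when its `channel` and `start` attributes match one already seen. The function name `merge_xmltv` is made up for this example.

```python
import xml.etree.ElementTree as ET


def merge_xmltv(documents):
    """Merge several XMLTV documents (given as strings) into one <tv> element.

    Assumption for this sketch: a programme is a duplicate when another
    programme with the same channel and start time was already seen.
    """
    merged = ET.Element("tv")
    seen_channels = set()
    seen_programmes = set()
    channels = []
    programmes = []

    for text in documents:
        root = ET.fromstring(text)
        for channel in root.findall("channel"):
            channel_id = channel.get("id")
            if channel_id not in seen_channels:
                seen_channels.add(channel_id)
                channels.append(channel)
        for programme in root.findall("programme"):
            key = (programme.get("channel"), programme.get("start"))
            if key not in seen_programmes:
                seen_programmes.add(key)
                programmes.append(programme)

    # The XMLTV format lists all channels first, then all programmes.
    merged.extend(channels)
    merged.extend(programmes)
    return merged
```

To save the result you could call `ET.ElementTree(merged).write("tvguide.xml", encoding="utf-8", xml_declaration=True)`. A real script would also need to deal with differing channel ids between sources, which this sketch ignores.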

If you read the story on this guy's website and follow the instructions carefully, you'll find that it's a piece of cake.

From now on, downloading and updating the epg only takes a few seconds.

Fortunately for the Dutch, but unfortunately for others, the whole site and the readmes are in Dutch.

With some modifications to the files, this script can also work for other countries.

If people are interested, I will contact the guy and ask him if it's OK for me to write translations.
I can write in English, German, Dutch and Portuguese; the rest will be up to other people.


Hope this was helpful.

Greetz,

SquonkSC
 

Alconja

Portal Member
July 12, 2006
17
0
Sydney, Australia
I couldn't agree more. For countries without publicly available XML EPG data, we need to merge our efforts so that not everyone is running scrapers all the time (& thereby making the owners of the websites do things like use encrypted links; part of the reason they do this is probably to try & prevent what amounts to a mass DDoS attack from HTPC/PVR owners trying to get EPG data).

Not sure which country the data you've listed there is for, but for Australian EPG data I recommend the script here: https://www.team-mediaportal.com/files/Download/Plugins/EPG/GetAussieEPG/ which uses JRobbo's data (more info here: http://wiki.dvbowners.com/index.php/JRobbo's_EPG_Guides). Nothing beats downloading 7 days' worth of data in 10 seconds.

At the moment the data doesn't have full length descriptions or genre information, but that's because no scrapers currently do this. As soon as one of them starts working again, JRobbo's data will be complete again (i.e. you don't need to be messing around with the scrapers by hand yourself when things change).
 

Callifo

Retired Team Member
  • Premium Supporter
  • December 7, 2004
    1,439
    21
    Adelaide, Australia
    Home Country
    Alconja, I used that site as a fill-in a couple of times, but it's not a week, it's only 3 days (including the day you're on). Or it is for my region, anyway. Hence I went back to scraping Ninemsn; I just need to wait for DUGG to come back.
     

    Alconja

    Portal Member
    July 12, 2006
    17
    0
    Sydney, Australia
    It certainly used to have a week of data for me (Sydney), but I haven't checked since the current data issues (ninemsn) came up. As for waiting for DUGG to come back, I remember reading over on the DVBOwners forums that JRobbo's data comes from 4 separate scrapers, but none of them are getting all the data properly at the moment, and it's a pretty safe bet that one of the scrapers is DUGG. The point I'm getting at is that when DUGG (or one of the other scrapers) comes back, JRobbo's data will come back.

    For the long term I think it's a much better process to have a minimal number of scrapers running against sites like ninemsn's and putting the gathered XML EPG somewhere central. Then everyone who wants the data can get it ready-made from a central source (whether it's JRobbo's or another... perhaps the MP team could set up a single scraper somewhere & host the zipped xml files on the MP servers).

    This has two benefits: 1) It means we get the same data much quicker, since you just have to download one smallish zip file rather than hundreds of web pages, and 2) it means that there won't be heaps of people all pounding sites like ninemsn, requesting every single page over & over (which would go off like a red light in their logs & cause them to try & block us out again using ever more complicated tactics).
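
To give a rough idea of how little work the client side needs once the data is hosted centrally, here is a small Python sketch. It only covers unpacking the guide from an already-downloaded zip archive; the download itself and the server location are left out because nothing like that exists yet. The function name `extract_xml_from_zip` and the assumption that the archive contains a file ending in `.xml` are mine, purely for illustration.

```python
import io
import zipfile


def extract_xml_from_zip(zip_bytes):
    """Return the raw bytes of the first .xml member of a zip archive.

    Assumption for this sketch: the hosted archive holds the guide as a
    file whose name ends in ".xml".
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                return archive.read(name)
    raise ValueError("no .xml file found in archive")
```

One download and one unzip replace hundreds of page requests per user, which is the whole point of the central approach.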

    Sorry for the rant... Just my humble opinion. :)
     

    Callifo

    Retired Team Member
  • Premium Supporter
  • December 7, 2004
    1,439
    21
    Adelaide, Australia
    Home Country
    While that tactic might reduce MP's contribution to scraping Ninemsn, it won't really stop the largest part of the traffic. Between WS and MCE2005, those users are gonna be the real problem in terms of bandwidth. Gotta convince them too.

    At the moment I'm using the WS Ninemsn scraper, and that seems to have been far less affected by Ninemsn than DUGG, unfortunately. It does a little bit of caching, but it doesn't retrieve the extra program info (it doesn't download the associated pages, which in itself saves a lot of bandwidth).
     

    zombiepig

    Portal Pro
    March 21, 2005
    408
    0
    Melb, Aus
    Home Country
    When DUGG was working, it was sending the results to an FTP site anyway. So if you were the first to grab the data from msn, your results got sent to the DUGG FTP. Then everyone else just got it from the FTP!!
     

    Alconja

    Portal Member
    July 12, 2006
    17
    0
    Sydney, Australia
    EqualRightsForWerewolves said:
    When DUGG was working, it was sending the results to an FTP site anyway. So if you were the first to grab the data from msn, your results got sent to the DUGG FTP. Then everyone else just got it from the FTP!!
    Ahh, apologies then, didn't realise DUGG was already doing this. Hopefully DUGG comes back soon then & I'll give it a go.
     

    Callifo

    Retired Team Member
  • Premium Supporter
  • December 7, 2004
    1,439
    21
    Adelaide, Australia
    Home Country
    I realise that, but it wasn't always accurate. When shows changed, you weren't able to fix that without the 'noftp' option, which then turned it back into a normal scraper.

    In the long run, if people want to reduce load on ninemsn, you will have to convince other users (WS and MCE2005 more specifically) to move to a more unified solution, despite the fact that they might be perfectly happy with their current one.
     

    Lyxalig

    MP Donator
  • Premium Supporter
  • January 30, 2005
    276
    1
    35
    Norway
    Home Country
    Norway
    What about the legal issues??
    I'm pretty sure this is the reason this is not getting done on a larger scale.. :?
     

    Alconja

    Portal Member
    July 12, 2006
    17
    0
    Sydney, Australia
    Callifo said:
    I realise that, but it wasn't always accurate. When shows changed, you weren't able to fix that without the 'noftp' option, which then turned it back into a normal scraper.

    In the long run, if people want to reduce load on ninemsn, you will have to convince other users (WS and MCE2005 more specifically) to move to a more unified solution, despite the fact that they might be perfectly happy with their current one.
    I agree about the need for broad-based support... not sure how best to achieve that (if it's possible at all; as you say, if people are on a good thing they won't switch). I guess if something like DUGG emerges as the obvious best, then hopefully people will migrate naturally. As for getting stale data, I guess DUGG needs to go back to the original source (ninemsn or wherever) once every X hours (24? 12? 6?) to make sure it's still got the latest... is DUGG open source / still actively being developed?
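
The "once every X hours" idea could look something like the Python sketch below. I don't know how DUGG handles this internally, so treat it purely as an illustration: the function name `needs_refresh` and the 12-hour default are assumptions picked from the numbers floated above.

```python
from datetime import datetime, timedelta


def needs_refresh(last_fetched, now, max_age_hours=12):
    """Return True when the cached guide is at least max_age_hours old.

    Assumption for this sketch: the shared copy records when it was last
    scraped from the original source, and 12 hours is an arbitrary default.
    """
    return now - last_fetched >= timedelta(hours=max_age_hours)
```

The first client to find the shared copy too old would scrape the source and upload a fresh copy; everyone else keeps downloading the shared file until it ages out again.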
     
