WebEPG pagegrab delay not working correctly

benjerry

MP Donator
  • Premium Supporter
  • September 26, 2007
    167
    10
    Home Country
    Netherlands Netherlands
    MediaPortal Version: 1.1.0 RC3
    MediaPortal Skin: Blue3 Wide
    Windows Version: Windows XP proff SP3
    CPU Type: AMD Athlon 64 x2 5000+ Brisbane
    HDD: Seagate Momentus 5400.3 80gb
    Memory: 2Gb 6400
    Motherboard: Asus M2A-VM HDMI
    Video Card: Sapphire HD4670
    Video Card Driver: Catalyst 9.5
    Sound Card: ATI/Realtek HDMI (by video card)
    Sound Card AC3:
    Sound Card Driver: 5.0.50000.15
    1. TV Card: FloppyDTV S2
    1. TV Card Type: DVB-S
    1. TV Card Driver: 5.0.1.0
    2. TV Card: Mystique SaTiX-S2 V2 Dual
    2. TV Card Type: DVB-S
    2. TV Card Driver: 1.1.0.28
    3. TV Card:
    3. TV Card Type:
    3. TV Card Driver:
    4. TV Card:
    4. TV Card Type:
    4. TV Card Driver:
    MPEG2 Video Codec: PowerDVD 7
    MPEG2 Audio Codec: ffdshow
    h.264 Video Codec: PowerDVD 8
    Satelite/CableTV Provider: 28.5E, 23.5E, 19.2E, 13.0E
    HTPC Case: SuperPower Cube MARS
    Cooling: Cooler Master Gemini II
    Power Supply: Antec Earthwatts 380w
    Remote: Microsoft MCE Remote
    TV: Panasonic 50PZ80
    TV - HTPC Connection: HDMI via Denon AVR-1910


    Hi,

    I'm working on a new grabber for a website which is protected against quick page requests (tvgids.upc.nl), so I also need to use the delay option.

    However, the delay does not seem to work correctly (on TV-Server RC3).
    When I look in tv.log, I see a delay of twice the amount specified in the script (12000 ms becomes 24 s) before each listing page request.
    After that, the list page is grabbed and ALL sublink pages are grabbed in quick succession without any delay in between, resulting in refused pages (see 2010-06-02 10:12:52.715571 in tv.log).
     

    benjerry

    I will take a look and report back.

    Thanks, I hope that you can find it.

    I've taken a look at the code, but it's difficult to search through and I don't have a development environment here.

    I've seen a minimum of 500 being used, which I assume means 500 ms? Based on the data in my tv.log file, that does not seem to be what is actually applied.
     

    arion_p

    Retired Team Member
  • Premium Supporter
  • February 7, 2007
    3,373
    1,626
    Athens
    Home Country
    Greece Greece
    Try the attached TvLibrary.Utils.dll

    It should solve the problem. It seems the delay was not honored for sublinks. Also, the delay is currently applied twice, so I need to find a way to apply it only once, and also make sure the above fix does not break the other grabbers. Perhaps sublinks should have a separately configurable delay.
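    To illustrate the intended behaviour (one delay before every request, list page and sublinks alike), here is a minimal Python sketch. It is not the TvLibrary.Utils code; `RequestThrottle`, `grab_listing`, and the `fetch` callback are hypothetical names used only for this example.

```python
import time


class RequestThrottle:
    """Enforce a minimum gap between consecutive page requests (illustrative only)."""

    def __init__(self, delay_ms, clock=time.monotonic, sleep=time.sleep):
        self.delay_s = delay_ms / 1000.0
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first one

    def wait(self):
        """Sleep only as long as needed so delay_s has passed since the last request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def grab_listing(fetch, list_url, sublink_urls, throttle):
    """Fetch the list page and then each sublink, waiting once before each request."""
    pages = []
    for url in [list_url, *sublink_urls]:
        throttle.wait()  # single delay per request, never doubled
        pages.append(fetch(url))
    return pages
```

    Because the throttle measures the time since the previous request, the delay is applied exactly once per page, and a site that needs no delay can simply use 0.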
     

    Attachments

    • TvLibrary.Utils.dll.zip
      19.4 KB

    benjerry

    Thanks a lot!

    Delays are now working for sublinks as well. I'm test grabbing at the moment; no errors so far.

    However, now that sublink delays are working (applied twice, as expected), the apparent minimum delay of 500 ms (x2 = 1 s) is also applied to websites that don't need it. Grabbing is now slower than necessary.

    A separate sublink delay also crossed my mind.
    It could be added to the Link tag so that it matches the Site tag:

    <Site url="" post="" external="" encoding="" delay=""/>
    <Link url="" post="" external="" encoding="" />

    Is the data for the Link tag initialised with the Site tag data?
    It would be nice to be able to write just this when everything else is the same:
    <Link delay="1000" />
     

    arion_p

    That is right. Sublink Url is initialized from Site Url.

    That extra delay was my main concern, so I will try to add a "delay" attribute in the Link tag, but I also have to remove the 500ms minimum and the double delay.
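    As a rough illustration of that inheritance, the following Python sketch starts from the Site attributes and lets a Link tag override them. The `delay` attribute on Link is only the proposal from this thread, the flat Listing layout is simplified, and treating empty Link attributes as "not set" is an assumption; `effective_link_settings` is a hypothetical helper, not part of WebEPG.

```python
import xml.etree.ElementTree as ET


def effective_link_settings(listing_xml):
    """Return the settings a sublink request would use (illustrative only)."""
    root = ET.fromstring(listing_xml)
    site = root.find("Site")
    link = root.find("Link")

    # Start from the Site attributes ...
    settings = dict(site.attrib)

    # ... and let non-empty Link attributes override them (assumption:
    # an empty attribute such as url="" means "inherit from Site").
    if link is not None:
        settings.update({k: v for k, v in link.attrib.items() if v != ""})
    return settings


example = """
<Listing type="Html">
  <Site url="http://example.invalid/list" post="" external="false" encoding="" delay="12000" />
  <Link delay="1000" />
</Listing>
"""
print(effective_link_settings(example))
```

    With this approach a grabber that does not set a delay on Link keeps its current behaviour, since everything falls back to the Site values.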
     

    benjerry

    I had one error in the first grabbing session. It didn't make much sense, because I think the delays in between were long enough. It occurred while grabbing the first list page of a channel. Perhaps a retry option would be nice: when no programmes can be matched on a page, the same page is loaded again after a delay and matching is tried once more.
    I will continue testing. :)
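    The retry idea could look roughly like the Python sketch below. It is only an illustration of the suggestion, not an existing WebEPG option; `grab_with_retry`, `fetch`, `parse`, `retries`, and `delay_s` are hypothetical names.

```python
import time


def grab_with_retry(fetch, parse, url, retries=2, delay_s=5.0, sleep=time.sleep):
    """Reload the page after a delay when no programmes could be matched."""
    for attempt in range(retries + 1):
        programmes = parse(fetch(url))
        if programmes:
            return programmes
        if attempt < retries:
            sleep(delay_s)  # wait before loading the same page again
    return []  # still nothing matched after all attempts
```

    A small retry count keeps a genuinely empty listing from slowing down the whole grab too much.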

    Btw, another issue came up in another new grabber.

    This fails:

    <Channels>
    <Channel id="cbsaction.co.uk" siteId="cbsaction" />
    </Channels>
    <Listing type="Html">
    <Site url="http://www.[ID].co.uk/tv_guide.php?section=day&amp;date=[YYYY]:[MM]:[DD]" post="" external="false" encoding="" />

    And this is fine:

    <Channels>
    <Channel id="cbsaction.co.uk" siteId="1" />
    </Channels>
    <Listing type="Html">
    <Site url="http://www.cbsaction.co.uk/tv_guide.php?section=day&amp;date=[YYYY]:[MM]:[DD]" post="" external="false" encoding="" />

    However, this prevents me from adding cbsreality and cbsdrama to the same grabber.

    Another example of multiple websites with the same layout that could be handled by one grabber:
    the Dutch SBS channels:

    NET 5 - TV Gids
    Veronica TV - TV Gids
    SBS 6 - TV-gids
     

    arion_p

    IIRC only the querystring part of the URL can be templated, not the hostname/path.
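    To make that restriction concrete, here is a small Python sketch that fills the placeholders only in the query string and leaves the hostname and path alone. It mirrors the behaviour as recalled above and has not been checked against the WebEPG source; `fill_query_template` is a hypothetical helper.

```python
from datetime import date
from urllib.parse import urlsplit, urlunsplit


def fill_query_template(url, site_id, day):
    """Substitute [ID], [YYYY], [MM] and [DD] in the query string only."""
    parts = urlsplit(url)
    query = (
        parts.query
        .replace("[ID]", site_id)
        .replace("[YYYY]", f"{day.year:04d}")
        .replace("[MM]", f"{day.month:02d}")
        .replace("[DD]", f"{day.day:02d}")
    )
    # Scheme, hostname and path are passed through untouched.
    return urlunsplit(parts._replace(query=query))


print(fill_query_template(
    "http://www.cbsaction.co.uk/tv_guide.php?section=day&date=[YYYY]:[MM]:[DD]",
    "1",
    date(2010, 6, 2),
))
```

    Under this model a placeholder in the hostname is never replaced, which would explain why the first grabber above fails while the second one works.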
     

    benjerry

    Oh ok. It would have been more convenient and efficient, but it's not a big deal.

    Something new again:
    I've run into a problem with my new tvgids.upc.nl grabber, which is the one using the delay option. Grabbing now seems to be fine for all channels except one. It's always the same channel that fails.

    Its grabbing URL is:
    Cartoon Network - TV-gids UPC

    I always get "WebEPG: No Listings Found" in tv.log.

    When I check with IE8, it looks the same as the other channels.

    When I check this channel with WebEPG Designer, the HTML source indeed contains no program information.

    It's weird; all the other channels tested so far are fine.
    For instance, Nat.Geo Wild - TV-gids UPC

    Is it a problem with the dot and slash "./" in the URL?

    I wonder whether it's something client-side or some server-side protection.

    Edit:
    I've tested with the wget.exe utility, and it grabs all the program information. So it must be something client-side in WebEPG, or something WebEPG does not do, such as pretending to be an IE browser.

    Edit2: It was incorrect to say all the others are fine, because I didn't test them all.
     

    benjerry

    I've tried the "external=true" page grab via IE, defined in the Site tag:

    <Site url="http://tvgids.upc.nl/TV/Guide/Channel/[ID]/[DAY_NAME]/" post="" external="true" encoding="" delay="1000" />

    But this option seems to be broken. See error.log attached.
     
