WebGrab+Plus a new xmltv grabber | Page 24

Discussion in 'xmlTV' started by WG++Maker, October 25, 2010.

  1. kleinerflo

    kleinerflo Portal Member

    Joined:
    January 25, 2009
    Messages:
    44
    Likes Received:
    6
    Occupation:
    Elektrotechniker
    Location:
    Bayern
    Ratings:
    +6 / 0
    Home Country:
    Germany
    Hello community, I have been using WebGrab++ for quite a while and am very happy with it. However, I have a problem with one series in the results of several grabbers: the title should be "NCIS: LA", but the "L. A." ends up in the sub-title.
    This is the result from tvspielfilm.de; the same thing happens with tvtoday.de.
    I hope you can help me get this under control.
    Code (Text):
    <programme start="20130104160500 +0100" stop="20130104165000 +0100" channel="13th Street">
      <title lang="de">Navy CIS</title>
      <title lang="xx">NCIS: Los Angeles</title>
      <sub-title lang="de">L. A.. Der einsame Wolf</sub-title>
      <desc lang="de">Eine frühere Navy-Nachrichtendienstlerin, die für eine Friedensorganisation arbeitete, wird ermordet</desc>
      <credits>
        <director>James Whitmore Jr.</director>
        <actor>Chris O'Donnell (Special Agent G. Callen)</actor>
        <actor>LL Cool J (Special Agent Sam Hanna)</actor>
        <actor>Daniela Ruah (Special Agent Kensi Blye)</actor>
        <actor>Linda Hunt (Henrietta "Hetty" Lange)</actor>
        <actor>Peter Cambor (Nate "Doc" Getz)</actor>
        <actor>Eric Christian Olsen (Marty Deeks)</actor>
        <actor>Barrett Foa (Eric Beale)</actor>
      </credits>
      <category lang="de">USA 2011</category>
      <category lang="de">Krimiserie</category>
      <date>2011</date>
      <episode-num system="onscreen">Staffel 3|Folge 54</episode-num>



     
  3. corporate_gadfly

    corporate_gadfly Portal Pro

    Joined:
    May 17, 2011
    Messages:
    396
    Likes Received:
    68
    Ratings:
    +72 / 0
    Home Country:
    Canada
    Thanks in advance for the wonderful program. I use tvguide.com for OTA EPG data for Buffalo (Toronto, actually - but close enough). Would someone be kind enough and willing to decipher the genre information for tvguide.com?

    I have figured out the following which may be helpful:
    • In the tab-delimited file, the genre information is always the 7th element in the row.
    The following lookup table can be used to figure out the genres:
    • 64 = movies
    • 1024 = sports
    • 2 = family
    • 256 = news
    • 1 = unknown?
    Now, someone who knows more about "scrubs" can perhaps help out with getting genre information for tvguide.com?

    Here's a small sampling of sports and news:


    0 2.1 WGRZHD 5 The Tim McCarver Show 6 1024 8192 16 0 2 8 1893476 11822 201301200330 30 347390 0
    0 2.1 WGRZHD 5 On the Money With Maria Bartiromo 6 256 8192 16 0 2 12 21449700 11822 201301200600 30 205344 0
    0 2.1 WGRZHD 5 Sunday Daybreak 18 256 8192 16 0 2 12 12741185 11822 201301200630 90 0 0
    0 2.1 WGRZHD 5 Today 12 256 8192 16 0 2 44 21428577 11822 201301200800 60 0 0
    0 2.1 WGRZHD 5 Meet the Press 12 256 8192 16 0 2 44 21422376 11822 201301200900 60 203044 0
    0 2.1 WGRZHD 5 Sabres Pregame 6 1024 8192 16 0 2 8 19252951 11822 201301201200 30 347088 0
    0 2.1 WGRZHD 5 NHL Hockey:Flyers at Sabres 30 1024 8192 16 0 2 41 1149318 11822 201301201230 150 0 0
    0 2.1 WGRZHD 5 Skiing:U.S. Freestyle World Cup 12 1024 8192 16 0 2 40 4063434 11822 201301201500 60 0 0
    0 2.1 WGRZHD 5 Channel 2 News 6 256 8192 16 0 2 4 4270837 11822 201301201800 30 0 0
    0 2.1 WGRZHD 5 NBC Nightly News 6 256 8192 16 0 2 44 2226933 11822 201301201830 30 0 0
    0 2.1 WGRZHD 5 Channel 2 News 6 256 8192 16 0 2 4 4270838 11822 201301202300 30 0 0
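The lookup described in this post can be sketched in Python. This is only an illustration, not part of WebGrab+Plus: the function name `genre_of`, the `position` parameter, and the sample row layout are assumptions made for the example. The only things taken from the post are the code values and the observation that the genre is the 7th tab-delimited element.

```python
# Genre codes reported in the post; the meaning of 1 is still uncertain.
GENRES = {
    1: "unknown",
    2: "family",
    64: "movies",
    256: "news",
    1024: "sports",
}


def genre_of(row: str, position: int = 7) -> str:
    """Return the genre name for one tab-delimited listing row.

    `position` is 1-based and defaults to the 7th element, as observed
    in the post. Rows that are too short or carry a non-numeric or
    unlisted code fall back to "unknown".
    """
    fields = row.rstrip("\n").split("\t")
    if len(fields) < position:
        return "unknown"
    try:
        code = int(fields[position - 1])
    except ValueError:
        return "unknown"
    return GENRES.get(code, "unknown")


# Hypothetical row: the field layout is assumed, only the 7th field matters.
sample = "\t".join(["0", "2.1", "WGRZHD", "5", "Sabres Pregame", "6", "1024", "8192"])
print(genre_of(sample))  # sports
```

If the genre turns out to sit at a different position in the real file, only the `position` argument needs to change.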



    Again, thanks in advance.

    Cheers.
     
    Last edited: January 21, 2013
  4. WG++Maker

    WG++Maker Portal Pro

    Joined:
    October 25, 2010
    Messages:
    130
    Likes Received:
    56
    Gender:
    Male
    Occupation:
    retired microchip engineer
    Location:
    La Gomera, Canary Islands
    Ratings:
    +58 / 0
    Home Country:
    Spain
    Hi Kleinerflo,

    Sorry that I didn't reply earlier; for some reason the auto notification doesn't seem to work for me.

    I need a bit of time to sort it out, but will be back.

    WG++Maker .. Jan

    Hi,

    I need a few days and will be back.

    WG++Maker .. Jan
     
  5. silentbuteo2

    silentbuteo2 Portal Member

    Joined:
    January 20, 2013
    Messages:
    8
    Likes Received:
    0
    Gender:
    Male
    Ratings:
    +1 / 0
    Home Country:
    Belgium
    @kleinerflo
    I've tested this, and for me it is grabbed correctly. Are you using the latest version of the .ini file?
    The latest version is from 01/11/2012.
    Here is the link: http://webgrabplus.com/sites/default/files/download/ini/detail/de_tvspielfilm.de.zip

    If the problem still occurs with the latest version, just let me know and I'll check further.

    Code (XML):
    <programme start="20130122143000 +0100" stop="20130122152000 +0100" channel="13th Street Universal">
      <title lang="de">Navy CIS</title>
      <sub-title lang="de">Max Destructo</sub-title>
      <desc lang="de">Makabrer Auftakt: Gibbs (Mark Harmon) und sein Team entdecken in einer gestohlenen Damenhandtasche Fingerkuppen und Zähne eines Corporals. Der Fall führt zu einer Gruppe Computerfreaks (unter ihnen: Beth Riesgraf aus der Serie "Leverage", hier als Maxine) und zeigt, dass Gibbs mit moderner Technik immer noch auf Kriegsfuß steht.(n)</desc>
      <category lang="de">Serie</category>
      <category lang="de">Krimiserie</category>
      <episode-num system="onscreen"> Folge 178</episode-num>
    </programme>
    <programme start="20130122152000 +0100" stop="20130122160500 +0100" channel="13th Street Universal">
      <title lang="de">Navy CIS: L. A.</title>
      <sub-title lang="de">Die Koreanerin</sub-title>
      <desc lang="de">Wissenschaftler Daniel Su, der eine Hightechausrüstung fürs Marine Corps entwickelt, wird ermordet. Ein Überwachungsvideo zeigt "Die Koreanerin" Lee Wuan Kai, eine kaltblütige Auftragskillerin, mit der bereits der NCIS an der Ostküste zu tun hatte. Weshalb Laborgenie Abby (Pauley Perrette) ein Gastspiel gibt(n)</desc>
      <category lang="de">Serie</category>
      <category lang="de">Krimiserie</category>
      <episode-num system="onscreen"> Folge 5</episode-num>
    </programme>
    <programme start="20130122160500 +0100" stop="20130122165000 +0100" channel="13th Street Universal">
      <title lang="de">Navy CIS: L. A.</title>
      <sub-title lang="de">Tinte in den Adern</sub-title>
      <desc lang="de">Ein Marine fällt während einer Party von der Dachterrasse eines Hotels. Wie sich herausstellt, war er bereits ohnmächtig, als ihn jemand über die Brüstung warf. Eine Spur führt die Navy-Agenten Callen, Kensi und Sam (Chris O'Donnell, Daniela Ruah, LL Cool J) zu einer Falschgeldbande(n)</desc>
      <category lang="de">Serie</category>
      <category lang="de">Krimiserie</category>
      <episode-num system="onscreen"> Folge 6</episode-num>
    </programme>
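To compare results like these across .ini versions, a short Python sketch can list the title and sub-title of every programme in an XMLTV file. It is an illustration only: the helper names `title_pairs` and `misplaced_prefix` are made up for this example, the sample document is trimmed down from the output quoted in this thread, and a full guide file is assumed to wrap its `<programme>` elements in a `<tv>` root.

```python
import xml.etree.ElementTree as ET

# Trimmed sample built from the two outputs quoted in this thread.
SAMPLE = """<tv>
  <programme start="20130122152000 +0100" stop="20130122160500 +0100" channel="13th Street Universal">
    <title lang="de">Navy CIS: L. A.</title>
    <sub-title lang="de">Die Koreanerin</sub-title>
  </programme>
  <programme start="20130104160500 +0100" stop="20130104165000 +0100" channel="13th Street">
    <title lang="de">Navy CIS</title>
    <sub-title lang="de">L. A.. Der einsame Wolf</sub-title>
  </programme>
</tv>"""


def title_pairs(xmltv_text: str):
    """Yield (title, sub-title) for each programme; only the first title is used."""
    root = ET.fromstring(xmltv_text)
    for programme in root.iter("programme"):
        yield (
            programme.findtext("title", default=""),
            programme.findtext("sub-title", default=""),
        )


def misplaced_prefix(sub_title: str, prefix: str = "L. A.") -> bool:
    """Flag sub-titles that start with text that belongs in the title."""
    return sub_title.startswith(prefix)


for title, sub_title in title_pairs(SAMPLE):
    flag = "CHECK" if misplaced_prefix(sub_title) else "ok"
    print(f"{flag}: {title} / {sub_title}")
```

The second sample programme is flagged because its sub-title begins with "L. A.", which is the symptom reported at the top of this page.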
     
    Last edited: January 22, 2013
  6. silentbuteo2

    silentbuteo2 Portal Member

    @corporate_gadfly

    I have looked at this "problem" and found that the current code grabs only the subcategory (basketball, football, comedy, ...).
    I adjusted the code to also grab the main category (movie, sports, family, news).
    Could you test this? I tested it with some sites, but before I put it in the release, I would like you to test it as well.
    Just change your .ini file using the code below. First find the line with "urldate.format" in your .ini and remove everything below it (including that line). Then append the code below.


    Code (Text):
    urldate.format {datestring|} * no value but required by the program
    index_showsplit.scrub {multi|'index_variable_element'||\n}
    index_temp_3.scrub {single(separator="\t" include=11)||||} * scrubs the show_id, needed for index_urlshow
    *
    index_date.scrub    {single(force)|||\t|}
    index_temp_1.scrub  {single(separator="\t" include=13)||||} * start in format yyyyMMddHHmm, we use substring
    index_title.scrub   {single(separator="\t" include=3)||||}
    index_temp_4.scrub  {single(debug separator="\t" include=5)||||} * category on the main page
    *
    title.scrub {single(separator="\t" include=2)||||<div style=}
    subtitle.scrub {single(separator="\t" include=3)||||<div style=}
    description.scrub {single(separator="\t" include=4)||||<div style=}
    director.scrub {single(separator="\t" include=14)||||<div style=}
    actor.scrub {single(separator="\t" include=15)||||<div style=}
    temp_1.scrub {single(separator="\t" include=11 exclude="other")||||<div style=} * category   (from detail page)
    temp_2.scrub {single(separator="\t" include=12 exclude="other")||||<div style=} * subcategory (from detail page)
    rating.scrub {single(separator="\t" include=8)||||<div style=}
    productiondate.scrub {single(separator="\t" include=10)||||<div style=}
    *
    * operations:
    *index_variable_element.modify {addstart|\t'config_xmltv_id'}
    *index_variable_element.modify {substring(type=word)|0 1}
    *index_variable_element.modify {addend|\t}
    scope.range {(datelogo)|end}
    index_variable_element.modify {addstart|'config_xmltv_id'}
    index_variable_element.modify {substring(type=word)|-1 1}
    * must contain a number
    index_temp_6.modify {calculate(format=F0)|'index_variable_element'}
    * clear if not a number
    index_variable_element.modify {clear('index_temp_6' "0")}
    index_variable_element.modify {addstart|\t}
    index_variable_element.modify {addend|\t}
    end_scope
    *
    scope.range {(indexshowdetails)|end}
    * correct date :
    index_date.modify {substring(type=char)|0 10}
    * compose start :
    index_temp_2.modify {substring(type=char)|'index_temp_1' 8 2} * the hours of start
    index_start.modify {addstart|'index_temp_2':} * add hours minutes separator
    index_temp_2.modify {substring(type=char)|'index_temp_1' -2} * the minutes of start
    index_start.modify {addend|'index_temp_2'}
    * compose index_urlshow :
    index_urlshow.modify {addstart('index_temp_3' not "")|http://www.tvguide.com/listings/data/detailcache.aspx?Qr='index_temp_3'&tvoid=0&v2=1}
    end_scope
    *
    title.modify {addstart(scope=showdetails "")|'index_title'}
    actor.modify {replace(scope=showdetails)|,|\|} * make actor a multi element
    * translate the category id to string
    index_temp_4.modify {replace("1")|1|}
    index_temp_4.modify {replace("2")|2|family}
    index_temp_4.modify {replace("64")|64|movie}
    index_temp_4.modify {replace("256")|256|news}
    index_temp_4.modify {replace("1024")|1024|sports}
    * add all the categories together
    category.modify {addstart(scope=showdetails 'temp_2' not "")|'temp_2'\|} * add subcategory (from the detail page)
    category.modify {addstart(scope=showdetails 'temp_1' not "")|'temp_1'\|} * add category (from detail page)
    category.modify {addstart('index_temp_4' not "")|'index_temp_4'\|}   * add category    (from index page)

    category.modify {cleanup(scope=showdetails removeduplicates=equal)}

    **  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _  _
    **    #####  CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
    **
    ** @auto_xml_channel_start
    ** the following 8 entries create a channel list file:
    *index_site_channel.scrub {multi(separator="\t")|magic=\n|||\n}
    *index_site_channel.modify {replace| |-} * replace space in channel name by -
    *index_site_channel.modify {replace|\b| } * replace char U+0008 (word separator) with a space
    *index_site_channel.modify {substring(type=word)|0 2}
    *index_site_id.scrub {multi(separator="\t")|magic=\n|||\n}
    *index_site_id.modify {replace| |-}
    *index_site_id.modify {replace|\b| } * replace char U+0008 (word separator) with a space
    *index_site_id.modify {substring(type=word)|-1}
    ** @auto_xml_channel_end
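For readers less familiar with the SiteIni syntax, the "compose start" and "add all the categories together" steps of the code above can be mirrored in a few lines of Python. This is a rough sketch of the intent, not how WebGrab+Plus works internally. The function names are invented for the example, unknown index codes are simply dropped, and duplicates are assumed to be compared by exact string equality.

```python
# Index-page category codes, as translated by the index_temp_4 replace lines.
INDEX_CATEGORIES = {
    "1": "",
    "2": "family",
    "64": "movie",
    "256": "news",
    "1024": "sports",
}


def compose_start(raw: str) -> str:
    """Turn a yyyyMMddHHmm string into HH:mm.

    Mirrors the index_start steps: two characters from offset 8 for the
    hours, the last two characters for the minutes.
    """
    return f"{raw[8:10]}:{raw[-2:]}"


def merge_categories(index_code: str, detail_category: str, detail_subcategory: str) -> list:
    """Combine index category, detail category and detail subcategory.

    The order matches the three addstart lines (index first, subcategory
    last); empty values and exact duplicates are skipped.
    """
    index_category = INDEX_CATEGORIES.get(index_code, "")  # assumption: drop unknown codes
    merged = []
    for name in (index_category, detail_category, detail_subcategory):
        if name and name not in merged:
            merged.append(name)
    return merged


print(compose_start("201301201230"))                   # 12:30
print(merge_categories("1024", "sports", "hockey"))    # ['sports', 'hockey']
```

With the index page reporting 1024 and the detail page reporting sports and hockey, the result is the same pair of categories, since the duplicate "sports" is removed.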
     
  7. corporate_gadfly

    corporate_gadfly Portal Pro

    I am so sorry @silentbuteo2. I probably made you do all that work for nothing. That's what happens when you encounter something new and unknown. I opened up my original tvguide.xml (the one processed by existing .ini files) and sure enough it has tons of <category> lines already:
    Code (Text):
    <category lang="en">sports</category>
    <category lang="en">hockey</category>
    where, I guess, sports is the main category and hockey is the subcategory. So, unless there was something obviously wrong with the original .ini files, I am reluctant to change over to the new ones.

    Do you still want me to test?
     
  8. silentbuteo2

    silentbuteo2 Portal Member

    @corporate_gadfly
    The original .ini file already contained the code to grab the main category, but it was commented out.
    The main category was also not always available, so I extended the code to handle that. The new code now grabs this extra info.
    If the old code works for you, just keep using it. If you want to test the new code, it is already released.
    http://webgrabplus.com/epg-channels#co
     
    Last edited: January 26, 2013
  9. WG++Maker

    WG++Maker Portal Pro

    As of today there is also another place to get support for WebGrab+Plus.

    Visit its new website http://www.webgrabplus.com/

    See you there --- WG++Maker --- Jan
     
  10. tom78

    tom78 Portal Pro

    Joined:
    August 10, 2007
    Messages:
    149
    Likes Received:
    5
    Ratings:
    +5 / 0
    Home Country:
    Germany
    Hello.
    For a while now I have had a problem with the tvtv.de.ini file. (Updating to SiteINI 11.10 doesn't solve the problem.)
    Webgrab downloads the index page, but then it shows the message "no shows in index page! Cannot find any shows in the index page !"

    Here's one channel for example:
    <channel update="i" site="tvtv.de" site_id="11" xmltv_id="VOX">VOX</channel>

    Could you please have a look at this?!
    Thanks!
     
  11. Lightning303
    • Premium Supporter

    Lightning303 MP Donator

    Joined:
    September 12, 2009
    Messages:
    798
    Likes Received:
    384
    Gender:
    Male
    Ratings:
    +578 / 0
    Home Country:
    Germany
    Same here :( It seems tvtv.de was bought by another company; maybe they fiddled around with their code.
    It would be great if WG++Maker could fix that (also the tvtv.de.xmltv_ns.ini version, as I am using that one ;P).
    Thanks
     