Develop my own Grabber

DKreeK

Portal Pro
October 22, 2004
351
0
44
Home Country
France France
Hello,

I want to develop my own grabber, but I have run into some problems. The only documentation I have found is the "commented-grabber.xml" file, and I still have a few questions:

- I want to use the [PAGE_OFFSET] tag in my site URL and have it go from 1 to 5. According to the documentation, I can use PageStart to start from 1 instead of 0, but this doesn't work. Also, I can't find where to set the upper limit.

- Here is an example of the HTML code:
Code:
<div class="horaire" style="width: 97px;">&nbsp;00H25</div>
<div class="thematique-34782" style="width: 97px;">&nbsp;DOCUMENTAIRE</div>
<div class="info" style="height: 61px; width: 97px;"><p style="padding: 2px 3px;">
	<a href="javascript:var retour=popupSimple('tpl72.htm&cha=239&thema=34782&DID=16591057', 'Inscription', 400, 584);" class="tt-blanc10">Photo de couv'</a>
	</p>
</div>
<div class="duree" style="width: 97px;">25 min </div>
<div id="zoomEmission16591057" class="DetailEmission" onmouseout="MM_changeProp('zoomEmission16591057', '', 'style.visibility', 'hidden', 'DIV');">
	<p style="padding: 2px;">
		<span class="tt-noir10">PHOTO DE COUV'</span><br /><br />
		De 00h25 à 00h50		<br /><br />
		DOCUMENTAIRE		&nbsp;-&nbsp;SOCIETE<br />
		<br />
		Rediffusion<br />
	</p>
</div>

I want to get:

Code:
  <programme start="20070102002500" stop="20070102005000" channel="discoveryrealtime.fr">
    <title>Photo de couv'</title>
    <category>Documentaire - Societe</category>
  </programme>

So I wrote:

Code:
        <SectionTemplate tags="TPA">
          <TemplateText>
	&lt;span class="tt-noir10"&gt;&lt;#TITLE&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;
		De&lt;#START&gt;à&lt;#END&gt;&lt;br/&gt;&lt;br/&gt;
	&lt;#GENRE&gt;&lt;br /&gt;
          </TemplateText>
        </SectionTemplate>

The problem is that it doesn't work. Here is the log file:
Code:
2007-01-02 00:50:58.928256 [Info.][1]: WebEPG: ChannelId: discoveryrealtime.fr
2007-01-02 00:50:58.943881 [Debug][1]: WebEPG: Grab Start 00:50 02/01/2007
2007-01-02 00:50:58.990756 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=0&multi=2 POST: 
2007-01-02 00:51:00.662631 [Info.][1]: WebEPG: Listing Count 25
2007-01-02 00:51:00.678256 [Warn.][1]: WebEPG: Program Count (0) < Listing Count (25), possible template error
2007-01-02 00:51:00.678256 [Info.][1]: WebEPG: Finished
Could you help me? Also, is it possible to fix the case of the title and the category (all lowercase with a capital first letter), like the PHP function "ucfirst"?
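To make the requested case conversion concrete, here is a minimal Python sketch. It is not WebEPG code, and the function names `ucfirst_lower` and `fix_category` are made up for illustration. It assumes the category parts are separated by a hyphen, as in "DOCUMENTAIRE - SOCIETE".

```python
def ucfirst_lower(text: str) -> str:
    """Lowercase the whole string, then capitalize only its first character."""
    text = text.strip()
    return text[:1].upper() + text[1:].lower()


def fix_category(text: str, sep: str = "-") -> str:
    """Apply ucfirst_lower to each part of a 'GENRE - SUBGENRE' string."""
    parts = [ucfirst_lower(part) for part in text.split(sep)]
    return " - ".join(part for part in parts if part)


title = ucfirst_lower("PHOTO DE COUV'")
category = fix_category("DOCUMENTAIRE - SOCIETE")
print(title)
print(category)
```

Note that PHP's "ucfirst" only uppercases the first character, so the lowercase step is a separate operation in this sketch.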

I also have another question: I want to get the description of each entry, but the description is on another page (a popup). The link appears earlier in the HTML; for this example it is: tpl72.htm&cha=239&thema=34782&DID=16591057
Is it possible to get the description from another HTML page?

Thanks in advance, and a happy new year to everybody!
 

DKreeK

Thank you for the link and the WebEPG-Designer-v3 program. It helps me a lot. I have a small problem: for each day, the schedule is split across 5 pages:
http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=1&multi=2
http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=2&multi=2
http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=3&multi=2
http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=4&multi=2
http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=5&multi=2

So in the grabber I use tra=[PAGE_OFFSET], and in the Search tag I put startPage="1".

Here is the log file :
2007-01-02 22:17:40.500000 [Info.][1]: WebEPG: Starting
2007-01-02 22:17:40.562500 [Info.][1]: Loading ChannelMap: WebEPG.xml
2007-01-02 22:17:40.578125 [Info.][1]: WebEPG: Getting Channel 1 of 1
2007-01-02 22:17:40.578125 [Info.][1]: WebEPG: Opening FR\canalsat_fr.xml
2007-01-02 22:17:41.031250 [Info.][1]: WebEPG: TimeZone, Local: Paris, Madrid
2007-01-02 22:17:41.031250 [Info.][1]: WebEPG: TimeZone, Site : Paris, Madrid
2007-01-02 22:17:41.140625 [Info.][1]: WebEPG: ChannelId: discoveryrealtime.fr
2007-01-02 22:17:41.140625 [Debug][1]: WebEPG: Grab Start 22:17 02/01/2007
2007-01-02 22:17:41.187500 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=1&multi=2 POST:
2007-01-02 22:17:42.593750 [Info.][1]: WebEPG: Listing Count 8
2007-01-02 22:17:42.609375 [Warn.][1]: WebEPG: Program Count (0) < Listing Count (8), possible template error
2007-01-02 22:17:42.609375 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=2&multi=2 POST:
2007-01-02 22:17:44.390625 [Info.][1]: WebEPG: Listing Count 10
2007-01-02 22:17:44.390625 [Warn.][1]: WebEPG: Program Count (0) < Listing Count (10), possible template error
2007-01-02 22:17:44.390625 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=3&multi=2 POST:
2007-01-02 22:17:45.671875 [Info.][1]: WebEPG: Listing Count 10
2007-01-02 22:17:45.671875 [Warn.][1]: WebEPG: Program Count (0) < Listing Count (10), possible template error
2007-01-02 22:17:45.671875 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=4&multi=2 POST:
2007-01-02 22:17:46.890625 [Info.][1]: WebEPG: Listing Count 6
2007-01-02 22:17:46.890625 [Warn.][1]: WebEPG: Program Count (0) < Listing Count (6), possible template error
2007-01-02 22:17:46.890625 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=5&multi=2 POST:
2007-01-02 22:17:47.953125 [Info.][1]: WebEPG: Listing Count 6
2007-01-02 22:17:47.968750 [Info.][1]: WebEPG: Guide, Program Info: 20070102223500 - En votre absence
2007-01-02 22:17:47.968750 [Info.][1]: WebEPG: Guide, Program Info: 20070102233000 - En votre absence
2007-01-02 22:17:47.968750 [Warn.][1]: WebEPG: Program Count (2) < Listing Count (6), possible template error
2007-01-02 22:17:47.968750 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=6&multi=2 POST:
2007-01-02 22:17:49.625000 [Info.][1]: WebEPG: Listing Count 6
2007-01-02 22:17:49.625000 [Info.][1]: WebEPG: Guide, Program Info: 20070103060000 - Maison d'enfer
2007-01-02 22:17:49.625000 [Info.][1]: WebEPG: Guide, Program Info: 20070103185500 - En votre absence
2007-01-02 22:17:49.625000 [Info.][1]: WebEPG: Guide, Program Info: 20070103195000 - Mission rénovation
2007-01-02 22:17:49.625000 [Warn.][1]: WebEPG: Program Count (5) < Listing Count (6), possible template error
2007-01-02 22:17:49.625000 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=7&multi=2 POST:
2007-01-02 22:17:50.640625 [Info.][1]: WebEPG: Listing Count 6
2007-01-02 22:17:50.640625 [Info.][1]: WebEPG: Guide, Program Info: 20070104060000 - Maison d'enfer
2007-01-02 22:17:50.640625 [Info.][1]: WebEPG: Guide, Program Info: 20070104185500 - En votre absence
2007-01-02 22:17:50.640625 [Info.][1]: WebEPG: Guide, Program Info: 20070104195000 - Mission rénovation
2007-01-02 22:17:50.640625 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=02-01-07&cha=239&tra=8&multi=2 POST:
2007-01-02 22:17:51.843750 [Info.][1]: WebEPG: Listing Count 6
2007-01-02 22:17:51.843750 [Info.][1]: WebEPG: Guide, Program Info: 20070105060000 - Maison d'enfer
2007-01-02 22:17:51.843750 [Info.][1]: WebEPG: Guide, Program Info: 20070105185500 - En votre absence
2007-01-02 22:17:51.843750 [Info.][1]: WebEPG: Guide, Program Info: 20070105195000 - Mission rénovation
2007-01-02 22:17:51.859375 [Info.][1]: WebEPG: Finished

For each day I need to go from 1 to 5, but I don't know where I can specify that 5 is the maximum for PAGE_OFFSET.
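The paging behavior wanted here can be sketched in Python as follows. This only illustrates the idea of substituting [PAGE_OFFSET] over a bounded range; it is not how WebEPG is implemented, and `expand_pages` and its parameters are hypothetical names.

```python
from typing import List


def expand_pages(url_template: str, start_page: int, end_page: int) -> List[str]:
    """Return one URL per page, replacing [PAGE_OFFSET] with start_page..end_page inclusive."""
    return [
        url_template.replace("[PAGE_OFFSET]", str(page))
        for page in range(start_page, end_page + 1)
    ]


TEMPLATE = (
    "http://www.canalsat.fr/index.php"
    "?tpl=217&DAY=02-01-07&cha=239&tra=[PAGE_OFFSET]&multi=2"
)

urls = expand_pages(TEMPLATE, start_page=1, end_page=5)
for url in urls:
    print(url)
```

With an upper bound of 5, the request for tra=6 seen in the log above would not be made.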

I also don't understand this warning: "WebEPG: Program Count (0) < Listing Count (8), possible template error". It finds the right number of entries, yet it reports a template error?
 

James

Retired Team Member
  • Premium Supporter
  • May 6, 2005
    1,385
    67
    Switzerland
    There is an endPage; it just wasn't in the docs, but it is now ;)

    I added support for fixing the case correctly. I also made a config file for you, which is now in SVN. It only has one channel, so it would be great if you could add the rest and test it. The output looks like this:

    Code:
      <programme start="20070103232500" stop="20070104001500" channel="tf1.fr">
        <title>Preuve à l'appui</title>
        <category>Serie - Suspense</category>
      </programme>

    Enjoy.
     

    DKreeK

    Thank you, it works better now. I'm waiting for the next SVN build to test it. I have a question: I want to add <previously-shown /> to the XML file if the text "Rediffusion" appears in my HTML template.

    In the wiki I see the tag <z> with regexp, but I can't find any example of how to use it.

    Thanks in advance.
     

    James

    You want to use this:

    <Search match="Rediffusion" field="#REPEAT" remove="false" />

    I have updated the docs with how to configure this.
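As a rough picture of the effect of that rule, the Python sketch below marks a programme as a repeat when the listing text contains "Rediffusion". It is an illustration only, not WebEPG's implementation; `build_programme` is a hypothetical helper, and the sample values are taken from the HTML and desired output earlier in the thread.

```python
import xml.etree.ElementTree as ET


def build_programme(title: str, start: str, stop: str, channel: str,
                    listing_text: str) -> ET.Element:
    """Build an XMLTV-style <programme>, adding <previously-shown /> for repeats."""
    programme = ET.Element("programme", start=start, stop=stop, channel=channel)
    ET.SubElement(programme, "title").text = title
    # Assumption: a plain substring test stands in for the Search match rule.
    if "Rediffusion" in listing_text:
        ET.SubElement(programme, "previously-shown")
    return programme


repeat = build_programme(
    "Photo de couv'", "20070102002500", "20070102005000",
    "discoveryrealtime.fr", "DOCUMENTAIRE - SOCIETE Rediffusion",
)
xml_text = ET.tostring(repeat, encoding="unicode")
print(xml_text)
```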
     

    DKreeK

    OK, thank you, it works fine. To finish my grabber I need to get information from a popup, so I want to use Sublinks.

    In the HTML code, I have:

    HTML:
    <a href="javascript:var retour=popupSimple('tpl72.htm&cha=239&thema=34782&DID=16591057', 'Inscription', 400, 584);" class="tt-blanc10">
      Photo de couv'
    </a>

    I need to open the page tpl72.htm&cha=239&thema=34782&DID=16591057 to get the description, so I inserted this in my grabber:

    Code:
    <Sublinks>
      <Sublink search="popupSimple" template="Details">
      </Sublink>
    </Sublinks>

    But nothing shows up in the log file. The documentation says that a sublink finds the matching <A href="..."> to retrieve the information. If I replace the text "popupSimple" with "tpl72.htm", I get:

    Code:
    2007-01-04 22:13:26.140625 [Info.][1]: WebEPG: Reading http://www.canalsat.fr/index.php?tpl=217&DAY=04-01-2007&cha=239&tra=5&multi=2 POST: 
    2007-01-04 22:13:27.281250 [Info.][1]: WebEPG: Listing Count 6
    2007-01-04 22:13:27.296875 [Info.][1]: WebEPG: Guide, Program Info: 20070104223500 / 20070104230000 - Naissance à domicile
    2007-01-04 22:13:27.296875 [Info.][1]: WebEPG: SubLink Request http://www.canalsat.fr/index.php?tpl=217&DAY=04-01-2007&cha=239&tra=1&multi=2 POST: 
    2007-01-04 22:13:29.468750 [Warn.][1]: WebEPG: Getting sublinked data failed

    I don't know why, but it requests the current page and not the popup page.
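For reference, the address that needs to be followed is the first quoted argument inside the javascript: link. A minimal Python sketch of that extraction is shown below. It is only an illustration of the idea, not the WebEPG sublink code, and `popup_target` is a hypothetical name.

```python
import re
from typing import Optional

# Matches "javascript:...(" followed by the first quoted argument of the call.
_JS_FIRST_ARG = re.compile(r"""^javascript:.*?\(\s*(['"])(.*?)\1""", re.IGNORECASE)


def popup_target(href: str) -> Optional[str]:
    """Return the first quoted argument of a javascript: href, or None if absent."""
    match = _JS_FIRST_ARG.search(href.strip())
    return match.group(2) if match else None


href = (
    "javascript:var retour=popupSimple("
    "'tpl72.htm&cha=239&thema=34782&DID=16591057', 'Inscription', 400, 584);"
)
target = popup_target(href)
print(target)
```

The extracted value is relative, so it would still have to be resolved against the site address before it can be requested.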

    In case it helps, here is my grabber: http://pastebin.team-mediaportal.com/11695
     

    James

    This should have worked, but there was a small bug in the code for JavaScript sublinks. I have fixed this and made a few small changes to the template to improve the GENRE parsing.

    It's now in SVN and should be ready to go :)
     

    DKreeK

    OK, thank you. I'll test it, add the list of channels, and send you the final version.
     
