
Hi,


I'm trying to write a scraper for cinefacts.de. This is my first attempt, so for now it only contains the search part.


My regex works fine against the search page, but I can't get the movie search to work. It only finds one movie title, and only if I write "zufaellig verheiratet"; it finds nothing for "Zufällig verheiratet". The year is also always (9999). Maybe one of you could take a look and tell me what's wrong and how to set this up.


Here's the code:[CODE]
<action name="search">

    <set name="offset" value="0" />

    <!-- Regular Expressions -->
    <!-- Groups by index: [0] movieID, [1] movieAKA, [2] movieTitle, [3] movieYear -->
    <set name="rx_search_results">
      <![CDATA[
      <a href="/kino/(?<movieID>[^/]+)/(?<movieAKA>[^/]+)/filmdetails\.html">\s+<b title="(?<movieTitle>.+?)"\s.+\s+\D+(?<movieYear>\d{4})
      ]]>
    </set>

    <!-- Retrieve results using Title -->
    <retrieve name="search_page" url="http://www.cinefacts.de/suche/suche.php?name=${search.title:safe}" />

    <!-- Parse the search results. If the regex does not match, the loop below is skipped. -->
    <parse name="details_page_block" input="${search_page}" regex="${rx_search_results}"/>
    <if test="${details_page_block[0][0]}!=">
        <loop name="item_return" on="details_page_block">
            <add name="counter" value1="${count}" value2="${offset}" />
            <set name="movie[${counter}].title" value="${item_return[2]:htmldecode}"/>
            <set name="movie[${counter}].alternate_titles" value="${item_return[1]:htmldecode}" />
            <!-- Test for the existence of a year before putting one in the movie info -->
            <if test="${item_return[3]}!=">
                <set name="movie[${counter}].year" value="${item_return[3]:htmldecode}"/>
            </if>
            <set name="movie[${counter}].site_id" value="${item_return[0]}"/>
            <set name="movie[${counter}].details_url" value="http://www.cinefacts.de/kino/${item_return[0]}/${item_return[1]}/filmdetails.html"/>
            <subtract name="movie[${counter}].popularity" value1="100" value2="${counter}"/>
        </loop>
    </if>

</action>

</ScriptableScraper>[/CODE]
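
One guess about the umlaut problem, not verified: the title might end up percent-encoded in a charset the site does not expect. The small Python sketch below is not part of the scraper; it only shows how the same title would look in the query string under UTF-8 and under Latin-1. Which of the two the `:safe` modifier produces, and which one cinefacts.de accepts, are both assumptions I have not checked.

```python
from urllib.parse import quote

# Title from the failing search.
title = "Zufällig verheiratet"

# Percent-encode the title as UTF-8 (the umlaut becomes two bytes).
utf8_query = quote(title, encoding="utf-8")

# Percent-encode the title as Latin-1 / ISO-8859-1 (the umlaut is one byte).
latin1_query = quote(title, encoding="latin-1")

print(utf8_query)    # Zuf%C3%A4llig%20verheiratet
print(latin1_query)  # Zuf%E4llig%20verheiratet
```

If the site only matches one of these forms, that could explain why the "ae" spelling works while the "ä" spelling does not.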


Many thanks


Schenk

