
Try this out:


[CODE]

<action name="search">

    <set name="offset" value="0" />

    <!-- Regular Expressions -->
    <set name="rx_search_results">
      <![CDATA[
      <a\shref="/kino/(?<movieID>[\d]+)[^<]+[^>]+>(?<movieTitle>[^<]+)[^\n]+\n[^O]+OT..(?<movieOT>[^<]+)[^\d]+(?<movieYear>\d{4})
      ]]>
    </set>

    <!-- Retrieve results using Title -->
    <retrieve name="search_page" url="http://www.cinefacts.de/suche/suche.php?name=${search.title:safe}" />

    <!-- Parse the search page. If the regex does not match, the result is empty and the loop is skipped. -->
    <parse name="details_page_block" input="${search_page}" regex="${rx_search_results}"/>
    <if test="${details_page_block[0][0]}!=">
        <loop name="item_return" on="details_page_block">
            <add name="counter" value1="${count}" value2="${offset}" />
            <set name="movie[${counter}].title" value="${item_return[1]:htmldecode}"/>
            <set name="movie[${counter}].alternate_titles" value="${item_return[2]:htmldecode}" />
            <!-- Test for the existence of a year before trying to put one in the movie info -->
            <if test="${item_return[3]}!=">
                <set name="movie[${counter}].year" value="${item_return[3]:htmldecode}"/>
            </if>
            <set name="movie[${counter}].site_id" value="${item_return[0]}"/>
            <set name="movie[${counter}].details_url" value="http://www.cinefacts.de/kino/${item_return[0]}/${item_return[1]}/filmdetails.html"/>
            <subtract name="movie[${counter}].popularity" value1="100" value2="${counter}"/>
        </loop>
    </if>

  </action>

</ScriptableScraper>


[/CODE]


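Try not to use .+ or .* in your regular expressions. They are greedy and can easily match far more than you intended; a negated character class such as [^<]+ stops at the next delimiter instead. That is at least something that I try to avoid.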
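
Here is a rough sketch of why, written in Python only because it is easy to run on its own. The HTML fragment is made up for illustration and is not taken from the real search page, and the scraper engine uses .NET-style regex, so treat this as a demonstration of the general behaviour rather than of the script above.

```python
import re

# Hypothetical HTML fragment, invented for this example.
html = '<a href="/kino/1/a.html">First</a> <a href="/kino/2/b.html">Second</a>'

# Greedy ".+" runs to the end of the line and backtracks only as far as it
# must, so a single match swallows both links and pairs the wrong ID and title.
greedy = re.findall(r'<a\shref="/kino/(\d+).+>(.+)<', html)

# Negated character classes stop at the first delimiter, so each link
# is matched separately.
bounded = re.findall(r'<a\shref="/kino/(\d+)[^>]*>([^<]+)<', html)

print(greedy)   # [('1', 'Second')]
print(bounded)  # [('1', 'First'), ('2', 'Second')]
```

The greedy version does not fail outright, it just quietly returns the wrong data, which is what makes it hard to spot in a scraper.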


-LRFalk01

