IMDb scraper request with Short summary

ysmp

Design Group
  • Team MediaPortal
  • May 17, 2008
    1,863
    744
    Seoul.
    Home Country
    South Korea South Korea
    Hi RoChess! How are you?

    May I ask you for the updated version of the script, the one that gets the short summary, as always... :)

    And, if possible, could you also change two more things?

    I would like to get:

    * only 1 director
    * only 1 writer

    Thank you... :) :D
     

    RoChess

    Extension Developer
  • Premium Supporter
  • March 10, 2006
    4,434
    1,897

    The other thread is meant for the default IMDb scrapers, so I moved your request here; otherwise things get confusing.

    The MovingPictures plugin will later offer options to control, to some extent, what the scraper does, so you will be able to configure that you prefer short summaries over long ones, one director/writer over all, etc.

    Until that update is done, I will teach you how to do it yourself.

    [collapse]
    Open a copy of the scraper in Notepad (you don't want to lose the original in case you make a mistake).

    If you have already installed the scraper you are about to edit, then you need to fix the header information.

    Code:
        <version major="1" minor="5" point="..." />
        <published month="04" day="..." year="2010" />

    Make sure the "..." parts in the above section are higher than those of the version you already have installed, or the upgrade will fail.
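
    To illustrate what "higher" means here, a minimal Python sketch follows. It assumes versions compare as (major, minor, point) triples; the actual check MovingPictures performs on import is not shown in this thread, and the version numbers and the `is_upgrade` helper are made up for illustration.

```python
# Hypothetical sketch: treat a scraper version as a (major, minor, point) tuple.
# The real upgrade check in MovingPictures may differ; the numbers are made up.

def is_upgrade(installed, candidate):
    """Return True when candidate is strictly newer than installed."""
    return tuple(candidate) > tuple(installed)


installed = (1, 5, 3)                     # made-up "already installed" version
print(is_upgrade(installed, (1, 5, 4)))   # point raised: counts as an upgrade
print(is_upgrade(installed, (1, 5, 3)))   # same version: not an upgrade
```

    Under that assumption, raising only the point number is enough for the edited copy to count as newer.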

    Then locate and delete the following section:

    Code:
          <!-- Plot Summary -->
          <retrieve name='summary_page' url='http://www.imdb.com/title/${movie.site_id}/plotsummary'/>
          <parse name="summary" input="${summary_page}" regex="${rx_plot}"/>
          <set name="summary_clean" value="${summary[0][0]:striptags}" />
          <set name="movie.summary" value="${summary_clean:htmldecode}" />

    That section gets the long summary, and the code following it handles the short summary when no long summary exists. So by deleting this section you end up with the short summary only.
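
    The fallback order described above can be sketched roughly like this in Python. It is only a model of the logic, not the scraper engine; `pick_summary` and its `use_long_step` flag are hypothetical names, and the sample strings are made up.

```python
# Rough model of the summary logic: the long summary wins when the
# "Plot Summary" step runs and finds something, else the short one is used.
# Function name, flag, and sample text are hypothetical.

def pick_summary(long_summary, short_summary, use_long_step=True):
    """Return the long summary if that step is enabled and non-empty."""
    if use_long_step and long_summary:
        return long_summary
    return short_summary


long_text = "A long plot summary from the plotsummary page."
short_text = "A short plot outline."

print(pick_summary(long_text, short_text))                       # long one
print(pick_summary(long_text, short_text, use_long_step=False))  # short one
```

    Deleting the section corresponds to `use_long_step=False`: the long summary is never fetched, so the short one is always used.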

    For single director, you want to eliminate the 'loop' that gets all the directors, so locate:

    Code:
          <!-- Directors -->
          <parse name="directors_block" input="${details_page}" regex='&lt;h5&gt;Director[s]?:&lt;/h5&gt;.*?&lt;/div&gt;'/>
          <parse name="directors" input="${directors_block}" regex='&lt;a href="/name/nm\d{7}/"[^&gt;]*&gt;([^&lt;]+)&lt;/a&gt;'/>
          <set name='movie.directors' value=''/>
          <loop name='currDirector' on='directors'>
            <set name="movie.directors" value="${movie.directors}|${currDirector[0]:htmldecode}"/>
          </loop>

    and change it into:

    Code:
          <!-- Directors -->
          <parse name="directors_block" input="${details_page}" regex='&lt;h5&gt;Director[s]?:&lt;/h5&gt;.*?&lt;/div&gt;'/>
          <parse name="directors" input="${directors_block}" regex='&lt;a href="/name/nm\d{7}/"[^&gt;]*&gt;([^&lt;]+)&lt;/a&gt;'/>
          <set name="movie.directors" value="|${directors[0][0]:htmldecode}|"/>
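
    To see what the two variants produce, here is a small Python sketch using the same regex with the XML entities decoded. The sample HTML is made up, and returning an empty string when nothing matches is an assumption of this sketch, not documented scraper behaviour.

```python
import re
from html import unescape

# Regex from the scraper with &lt; and &gt; decoded to < and >.
NAME_LINK = re.compile(r'<a href="/name/nm\d{7}/"[^>]*>([^<]+)</a>')

# Made-up HTML standing in for the directors block.
SAMPLE_BLOCK = (
    '<h5>Directors:</h5>'
    '<a href="/name/nm0000001/">First Person</a>, '
    '<a href="/name/nm0000002/">Second &amp; Co</a></div>'
)


def all_names(block):
    """Mimic the loop: append every name after a '|' separator."""
    result = ""
    for name in NAME_LINK.findall(block):
        result = result + "|" + unescape(name)
    return result


def first_name(block):
    """Mimic the edited version: keep only the first match."""
    names = NAME_LINK.findall(block)
    # Empty string on no match is an assumption made for this sketch.
    return "|" + unescape(names[0]) + "|" if names else ""


print(all_names(SAMPLE_BLOCK))   # |First Person|Second & Co
print(first_name(SAMPLE_BLOCK))  # |First Person|
```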

    The same for writers, locate:

    Code:
          <!-- Writers -->
          <parse name="writers_block" input="${details_page}" regex="${rx_writers_block}" />
          <parse name='writers' input="${writers_block}" regex='&lt;a href="/name/nm\d+/"[^&gt;]*&gt;([^&lt;]+)&lt;/a&gt;'/>
          <set name='movie.writers' value=''/>
          <loop name='currWriter' on='writers'>
            <set name='movie.writers' value='${movie.writers}|${currWriter[0]:htmldecode}'/>
          </loop>

    and change it into:

    Code:
          <!-- Writers -->
          <parse name="writers_block" input="${details_page}" regex="${rx_writers_block}" />
          <parse name='writers' input="${writers_block}" regex='&lt;a href="/name/nm\d+/"[^&gt;]*&gt;([^&lt;]+)&lt;/a&gt;'/>
          <set name='movie.writers' value='|${writers[0][0]:htmldecode}|'/>
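
    One small difference between the two blocks: the directors regex uses `\d{7}` while the writers regex uses `\d+`. The Python sketch below shows how they would behave if a name ID ever had more than seven digits. Both sample links are made up.

```python
import re

# Both patterns from the scraper, with XML entities decoded.
DIRECTOR_LINK = re.compile(r'<a href="/name/nm\d{7}/"[^>]*>([^<]+)</a>')
WRITER_LINK = re.compile(r'<a href="/name/nm\d+/"[^>]*>([^<]+)</a>')

# Made-up links: one with a 7-digit ID, one with an 8-digit ID.
seven = '<a href="/name/nm0000001/">Seven Digits</a>'
eight = '<a href="/name/nm00000001/">Eight Digits</a>'

print(DIRECTOR_LINK.findall(seven))  # ['Seven Digits']
print(DIRECTOR_LINK.findall(eight))  # [] because \d{7} needs exactly 7 digits
print(WRITER_LINK.findall(seven))    # ['Seven Digits']
print(WRITER_LINK.findall(eight))    # ['Eight Digits']
```

    If that ever matters, using `\d+` in the directors regex as well would make both blocks behave the same way.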

    Then save the file and import it. For instructions on how to import a new scraper, see the FAQ.[/collapse]

    Enjoy.
     

    ysmp

    Hi RoChess! Thank you so much, now I can do it myself... that's great... :)

    It's great to hear the plugin will give us some options to control some things in the script...

    Anyway, thanks again... :D

    After I finish and test it, I will post it here in case more users want to use it.

    Edit: Did it... very easy to do... :) Thank you...

    This script is modified to use the Short Summary and only 1 director and 1 writer, please note that.
     
