CSFD scraper script 0.2.3 [CZ]
[QUOTE="acmetelka, post: 814334, member: 116899"]
Hi guys,

First question: what is the 0.2.3+ version?

And second: my solution so far.

Thanks for the hints, JiRo. I dug a little deeper into the XML, spent some hours understanding it, and made some changes for my own use.

I left the search node unchanged. In the get_details node, I changed only the section that parses the title from the CSFD movie detail page. The page contains an og:title property for Facebook, and it seems to me that it always holds "Czech title / original title" (or only the original title if no Czech one is available). So I extract that property and then parse both titles out of it. I added one configuration variable to the script, pref_title, which selects what goes into the title (cz, ori, firstori, firstcz).

I don't see the AKA titles in the MP plugin, so I decided to do it this way.

Here is the changed part of the script:

[CODE]
...

<!-- Retrieve details -->

<set name="movie.details_url" value="${site}${movie.site_id}" />
<retrieve name="details_page" url="${movie.details_url}" encoding="utf-8" retries="10" timeout_increment="3000" allow_unsafe_header="true" />

<!-- Preferred title from the CSFD DB. Values: cz, ori, firstori, firstcz -->
<set name="pref_title" value="firstori" />

<!-- Regular expressions for parsing the og:title property from the movie detail HTML page -->

<set name="rx_og_title">
  <![CDATA[
  <meta property="og:title" content="(.*?)" />
  ]]>
</set>

<set name="rx_parse_og_title">
  <![CDATA[
  content="(.*?) / (.*?)\(
  ]]>
</set>

<!-- OG meta property title -->
<parse name="og_title_all" input="${details_page}" regex="${rx_og_title}" />
<parse name="title_main" input="${og_title_all}" regex="${rx_parse_og_title}" />
<parse name="title_ori" input="${title_main[0][1]}" regex="(.+?)(?:, (The|A|An|Ein|El|Das|Die|Der|Les|Un|Une))?[ \t]*$" />

<!-- Set the movie title according to the pref_title variable -->

<!-- ori: original title, falling back to the Czech title when it is empty -->
<if test="${pref_title}=ori">
  <if test="${title_ori[0][0]}=">
    <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
  </if>
  <if test="${title_ori[0][0]}!=">
    <set name="movie.title" value="${title_ori[0][0]:htmldecode}" />
  </if>
</if>

<!-- cz: Czech title only -->
<if test="${pref_title}=cz">
  <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
</if>

<!-- firstori: "original ( Czech )", or a single title when both are equal -->
<if test="${pref_title}=firstori">
  <if test="${title_ori[0][0]}=">
    <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
  </if>
  <if test="${title_ori[0][0]}!=">
    <if test="${title_ori[0][0]}=${title_main[0][0]}">
      <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
    </if>
    <if test="${title_ori[0][0]}!=${title_main[0][0]}">
      <set name="movie.title" value="${title_ori[0][0]:htmldecode} ( ${title_main[0][0]:htmldecode} )" />
    </if>
  </if>
</if>

<!-- firstcz: "Czech ( original )", or a single title when both are equal -->
<if test="${pref_title}=firstcz">
  <if test="${title_ori[0][0]}=">
    <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
  </if>
  <if test="${title_ori[0][0]}!=">
    <if test="${title_ori[0][0]}=${title_main[0][0]}">
      <set name="movie.title" value="${title_main[0][0]:htmldecode}" />
    </if>
    <if test="${title_ori[0][0]}!=${title_main[0][0]}">
      <set name="movie.title" value="${title_main[0][0]:htmldecode} ( ${title_ori[0][0]:htmldecode} )" />
    </if>
  </if>
</if>

<!-- Title (original from Trottel, not used) -->
<!--
<parse name="titleaa" input="${details_page}" regex="&lt;h1&gt;(.+?)(?:, (The|A|An|Ein|El|Das|Der|Die|Les|Un|Une))?(?:\s&lt;span.+?&lt;/span&gt;)?.*?&lt;/h1&gt;" />
<set name="movie.title" value="${titleaa[0][1]:htmldecode} ${titleaa[0][0]:htmldecode}" />
<replace name="movie.title" input="${movie.title}" pattern="( \(TV film\))" with="" />
-->

<!-- Alternate Titles -->

...
[/CODE]

The result in MP is attached.

Metelka ( Jindrich )
[/QUOTE]
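The title selection described in the post can be restated outside the scraper script as a rough Python sketch. It is only an illustration of the four pref_title modes, not part of the script: the function names are hypothetical, the og:title value is assumed to look like "Czech title / Original title (year)" with the "Czech title / " part missing when no Czech title exists, and the handling of trailing articles (", The" and so on) from the script is left out.

```python
import re

VALID_PREFS = ("cz", "ori", "firstori", "firstcz")


def split_og_title(og_title):
    """Return (czech, original); original is '' when only one title is present."""
    # Drop a trailing parenthesised suffix such as a year (assumed format).
    base = re.sub(r"\s*\([^()]*\)\s*$", "", og_title).strip()
    czech, _, original = base.partition(" / ")
    return czech.strip(), original.strip()


def choose_title(og_title, pref="firstori"):
    """Pick the movie title according to the pref_title mode from the post."""
    if pref not in VALID_PREFS:
        raise ValueError(f"unknown pref_title: {pref!r}")
    czech, original = split_og_title(og_title)
    # With a single title, or two identical ones, there is nothing to choose.
    if not original or original == czech:
        return czech
    if pref == "cz":
        return czech
    if pref == "ori":
        return original
    if pref == "firstori":
        return f"{original} ( {czech} )"
    return f"{czech} ( {original} )"  # firstcz
```

For example, with the made-up value "Cesky nazev / Original Title (1999)", the firstori mode would yield the original title followed by the Czech one in parentheses, mirroring the nested if blocks in the script.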