CSFD scraper script 0.2.3 [CZ]
<blockquote data-quote="RoChess" data-source="post: 813751" data-attributes="member: 18896"><p>I feel your pain. The imdb.com website decided, out of sheer stupidity, that if you are in Germany and viewing imdb.com you should get the German translated title (if one is available). You can disable this behaviour if you sign up at imdb.com and adjust your profile settings, but the scraper-script is unable to do that.</p><p></p><p>That's when I started writing a system in IMDb+ that attempts to 'recognize' what makes a title English. The current method works 'ok', but it is not without mistakes and now takes up more than a quarter of the entire IMDb+ scraper-script. I'm actually expanding it to support more languages (I lost my mind already, so why not <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" class="smilie smilie--sprite smilie--sprite8" alt=":D" title="Big Grin :D" loading="lazy" data-shortname=":D" />), but the only languages I can properly support right now are English, German, French, Spanish, Portuguese, Italian, Icelandic, Swedish and Dutch.</p><p></p><p>The only advice I can give you for the CSFD scraper-script is that inside the search-node you have access to the filtered title and even the filename itself. So you can use this not only as the search string to pass on to the CSFD website to locate the movie via alternative titles, but also to compare against the results.</p><p></p><p>Example File = "Puss in Boots (2011).mkv"</p><p></p><p>Search at CSFD with title "Puss in Boots" = <strong><a href="http://www.csfd.cz/hledat/?q=Puss+in+Boots" target="_blank">search results</a></strong></p><p></p><p>Not sure if that's how you do it in your scraper-script, or if you use some API or other method, but that makes no difference for what I'm trying to explain. 
So your CSFD scraper-script will eventually turn movie[0] into the details of the following movie: <strong><a href="http://www.csfd.cz/film/226213-kocour-v-botach/" target="_blank">Kocour v botách</a></strong>, and put in the AKA title of "Puss in Boots" as well.</p><p></p><p>MovingPictures is then able to match the original filename title with the AKA title and will auto-approve the search result (and instruct the CSFD scraper-script via the details-node to get all the info, such as summary, crew, etc.). I assume, however, that your CSFD scraper-script used movie[0].title = "Kocour v botách", so that is what MovingPictures will use as the title, even though it auto-approved via movie[0].alternative_title.</p><p></p><p>However, you have full control over what movie[0].title becomes inside the search-node, and you could have used movie[0].title = "Puss in Boots" instead, which would have given the English title.</p><p></p><p>So what you can do inside the search-node is compare the title from the filename, which is ${search.title}, to the AKA titles in the results, and when you find a match, overrule movie[0].title with the ${search.title} value.</p><p></p><p>Then you get the following results:</p><p></p><p>"Puss in Boots (2011).mkv" becomes "Puss in Boots"</p><p>and</p><p>"Kocour v botach (2011).mkv" becomes "Kocour v botách"</p><p></p><p>You can then also decide to lose your mind and offer these kinds of 'options' as configurable options to the user, which is what I did and what led to the IMDb+ plugin. That way the above system only gets used if a user has, for example, the 'Force English titles if filename matches' setting enabled. 
Otherwise, get your CSFD users to 'star' issue <strong><a href="http://code.google.com/p/moving-pictures/issues/detail?id=319" target="_blank">#319</a></strong> so that it will be easier for you to also support configurable options instead of having to write your own plugin (you are more than welcome to use the source code from the IMDb+ plugin project though).</p><p></p><p><strong>Remember</strong> that you also have to 'fix' the English title again inside the details-node, re-using the ${movie.title} value there to decide which title to use. At that moment ${movie.title} is the same value you assigned to movie[0].title in the search-node. You then have to repeat the verification and check whether the "${movie.title}" value matches the title shown next to the 'US flag' on the details page. It should be easy to retrieve that with a regular expression, because the flag icon is a fixed anchor you can use.</p><p></p><p>In fact you can use: &lt;img src="[^"]+" alt="USA" /&gt;[^&lt;]+&lt;h3&gt;(?&lt;EnglishTitle&gt;[^&lt;]+)&lt;/h3&gt;</p><p></p><p>A compounding problem, however, is that when a user 'refreshes' an existing movie, you could end up forcing an English title on them by mistake. To prevent that, verify that the actor/writer/director fields are empty before you apply the title tricks. At least that is how I solved the problem in IMDb+; if you figure out a better method I would be all ears <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" class="smilie smilie--sprite smilie--sprite6" alt=":cool:" title="Cool :cool:" loading="lazy" data-shortname=":cool:" /></p></blockquote><p></p>
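The search-node comparison described in the quoted post can be sketched in Python. This is only an illustration of the logic, not Moving Pictures scraper-script syntax. The function name `choose_title`, its parameters, and the case-insensitive comparison are assumptions made for the example. In a real script the inputs would be ${search.title}, the title found on the CSFD page, and the AKA titles scraped from that page.

```python
def normalize(title: str) -> str:
    """Prepare a title for comparison only: ignore case and extra whitespace."""
    return " ".join(title.casefold().split())


def choose_title(search_title: str, site_title: str, aka_titles: list[str]) -> str:
    """Return the value to store as movie[0].title (hypothetical helper).

    If the title taken from the filename matches one of the AKA titles,
    keep the filename title; otherwise fall back to the site's own title.
    """
    wanted = normalize(search_title)
    for aka in aka_titles:
        if normalize(aka) == wanted:
            return search_title
    return site_title


# The two cases from the post:
english = choose_title("Puss in Boots", "Kocour v botách", ["Puss in Boots"])
czech = choose_title("Kocour v botach", "Kocour v botách", ["Puss in Boots"])
```

In the first call the filename title matches an AKA title and is kept. In the second call nothing matches, so the title from the site (with its diacritics) is used.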