Adult movie scraper (details+covers) -- v2.0.7

Discussion in 'Moving Pictures' started by RoChess, December 17, 2009.

  1. RoChess
    • Premium Supporter

    RoChess Extension Developer

    Joined:
    March 10, 2006
    Messages:
    4,151
    Likes Received:
    1,293
    Ratings:
    +1,656 / 2
    The CDUniverse scraper-script will import your adult collection with as much detail as possible, and get high-resolution front and back cover artwork.

    To install the scraper (thanks Mew):

    1. Download the .xml file (attachment at bottom of this post) to a location you remember.
    2. Open "MediaPortal - Configuration", go to "plugins", select "Moving Pictures" and click "Config".
    3. Select the "Importer Settings" tab.
    4. In the "Data Sources:" section select the "Manually manage movie data sources" radio button.
    5. Click the "Movie Details Data Sources" button.
    6. In the popup click the arrow just to the right of the "+" button and pick "Add a New Data Source".
    7. Browse to the new .xml scraper file you downloaded in the first step and click OK.
    8. It should appear as 'CDUniverse.com' in the "Source" column. You may need to enable it by pressing the "+" button if it is greyed out, and move it to the position in the list that matches the priority you want.
    Note: It is not uncommon for the adult movie industry to use titles that match an actual IMDb movie title very closely. If you do not put this scraper at the highest priority, you might never see its results. You can also fix this another way: go to the "About" tab -> "Advanced Settings" -> "Matching and Importing" -> "Minimum Possible Match Threshold" and set it to '0'. This will cause every internet scraper to return matches, so disable all the ones you do not use. Placing this adult scraper at the very top is still needed for auto-approval to function, as only the first scraper is used for that. So you might want to run it like that for the initial import of your adult collection.

    NEW: You can now prefix your filenames with '$$$$$XXX-'. This will freak out almost all of the other scraper-scripts, so that they will not give any results at all. That in turn will allow the CDUniverse scraper-script in a lower priority position to be the only one to come up with a match. You will see 'xXx ' prefixed during the search-node stage when this happens (green circle appears), but during the details-node stage this will then be corrected again (green circle gets the white checkmark).
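    If you have a lot of files, the prefix can be added in bulk. Below is a minimal Python sketch of that, not part of the scraper itself. The function names, the list of video extensions and the dry-run default are all assumptions made for illustration; only the '$$$$$XXX-' prefix comes from the text above.

```python
from pathlib import Path
from typing import List, Tuple

PREFIX = "$$$$$XXX-"  # the prefix described in the post
# Assumption: which extensions count as movie files in your collection.
VIDEO_EXTENSIONS = {".avi", ".mkv", ".mp4", ".wmv", ".mpg"}


def prefixed_name(filename: str) -> str:
    """Return the filename with the prefix added, but never twice."""
    return filename if filename.startswith(PREFIX) else PREFIX + filename


def add_prefix(folder: Path, dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Collect (old, new) renames for video files under folder.

    With dry_run=True nothing is renamed; the planned renames are only returned.
    """
    planned = []
    # sorted() materialises the listing first, so renaming cannot disturb the walk.
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            target = path.with_name(prefixed_name(path.name))
            if target != path:
                planned.append((path, target))
                if not dry_run:
                    path.rename(target)
    return planned
```

    Calling add_prefix(Path(r"d:\movies")) first with the default dry run lets you review the planned renames before running it again with dry_run=False.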


    Technical details on scraper:

    • CDUniverse.com is used as the source, as it yielded the best results.
    • Both the front and back covers are obtained (you can switch from within the GUI), with the front cover set up as default. (This is no longer possible :( and just a small front cover is obtained.)
    • There is no support for Release Year to assist in finding a match, so to make the title results more relevant the most recent ones are listed first, which hopefully helps the auto-approval rate of your collection. -- Adjusted in v1.0.2
    • Blu-Ray movies are now supported (mainly for cover images) and are shown with (Blu-Ray) in the title. This will interfere with a title match, so DVD titles are still used by default. Either add the "(Blu-Ray)" string to your filenames, or simply override the drop-down box selection to pick the BR version (after the importer is done, or it can lead to a crash).
    • Folder names are now used to try to find a positive match if the filename is one of those cryptic acronyms from a scene group.
    Known issue:

    • CDUniverse adds 'DVD' to the end of every title. Don't know if it adds any other strings as my collection didn't span any of those. I have no clue how to fix the added postfix, so any help in this matter would be appreciated. -- Fixed in v1.0.1
    • PID support will not be added, because then backdrop support would be gone. Instead rename your file(s) to match the title that a manual search on CDUniverse gives you, and if that still fails, then please provide me with the filename.
    Changelog (February 15th 2014):



    • v2.0.0 - Fixed search node to find results again after CDUniverse changed their HTML code once more. Adjusted site_id to be the 7-digit product number only to make it easier on Follw.it developers, but added in code to retain backwards compatibility with the v1.x method. Also improved the cover node to obtain both front and back covers faster.
    • v2.0.1 - Search node fixed again to compensate for changes made by CDUniverse.
    • v2.0.2 - Internal tests on changes CDUniverse was working on, but kept undoing.
    • v2.0.3 - Internal tests on changes CDUniverse was working on, but kept undoing.
    • v2.0.4 - Seems CDUniverse finally decided to roll out the new HTML code, so this one fixes artwork, summary and auto-matches more titles.
    • v2.0.5 - Added support for custom filename prefix '$$$$$XXX-', so that CDUniverse can be placed in 2nd or lower scraper-script position and still be the only one to find a match.
    • v2.0.6 - Compensated for new HTML code and lost ability to get large front+back covers due to a referral block on their servers. Will have to settle for front cover thumbnail now.
    • v2.0.7 - Trying to bypass bot-detection, but it is hardcoded so it will probably not last long.
    Enjoy.
     

    Attached Files:

    Last edited: February 15, 2014
  3. RoChess
    Re: Adult movie scraper (details+covers) -- v1.0.1

    Release of v1.0.1

    Since nobody commented, I'm going to assume that there have been no problems found. New features have been added, so be sure to rescan your existing collection with this new scraper version to refresh the database fields.
     
  4. clahti

    clahti Portal Member

    Joined:
    November 19, 2008
    Messages:
    27
    Likes Received:
    2
    Ratings:
    +2 / 0

    LOL so I don't think anyone wants to admit they are using your script. I have not had a chance to check this out myself but I will soon and let you know what I find :D
     
  5. xcfalcon351

    xcfalcon351 Portal Member

    Joined:
    July 25, 2009
    Messages:
    7
    Likes Received:
    1
    Location:
    Forrestfield, WA
    Ratings:
    +1 / 0

    Great work. Really helps organise things across my collection.

    It seems to have trouble picking up correct names for movies from the site, though. I have my collection organised into individual sub-directories for each movie, and although I can find the correct title on the cduniverse site (often by simply cutting and pasting the directory name into the site search), the scraper seems to have difficulty, even if I enter a custom search string.

    Not sure if it's possible, but maybe there is a way of having the scraper search the CD Universe Part Number (or PID) as a last resort as there are a lot of movies which I know are listed on the site, but cannot be found with the scraper.

    Also, are you able to tell me which URL the cover downloader is looking for, so I can manually point it in the right direction for a few movies? I.e. do you point it to the main page of a movie, or open the covers first and use that? I have tried both and don't seem to be able to get it to work.

    Keep up the good work on this one. I know it will be appreciated, even if it seems most people out there won't admit to using it.
    :D

    Not sure what's involved, but the following URL might give better/more complete results:

    Search Extreme: The Joy of Search

    If I knew how I would try adding this to your scraper or re-writing...anyone want to take a stab at it?
     
  6. RoChess

    I use the details page for the cover as well. If you manually find a movie at CDUniverse.com you will see it has a small cover image with the links "Large front" and "Large back" below it. The scraper follows those links and then copies the URL of the larger image. If you find cover images elsewhere that you prefer, simply get the URL of the image (h**p://example.com/image.you.prefer.jpg) and enter it directly into the movie details. The bottom left area is for the covers, where you have the option to add an image from disk or from a URL; the MovPic plugin will then fetch that image and make a copy in the thumbs folder.

    As for mismatches or missing titles, the blame for that unfortunately lies with the crappy search engines that those adult sites have. They might have the exact same title as my search string at CDUniverse.com, yet it will be listed as the 11th movie, meaning it won't make it into the Moving Pictures listing. To compensate for that I adjusted the search parameters to list the most recent movies first. This seemed to be the solution for the relatively small sample collection I have, as 12 out of 14 movies auto-completed and the other 2 had the correct title available in the drop-down box for manual verification. This is about the same auto-approval rate I get with the IMDb.com scraper, so I didn't bother fixing it.

    So please provide me with a list of your folder+filename structure that failed to be listed at all, so that I can just rename some dummy files on my system into that name, and see where the problem lies. Send me a list of those filenames in an attachment, or private message, seeing as the nature of some filenames might not be suitable for this public thread.

    To generate the list of folders+filenames that I need use the following exactly as is (modified to your own "d:\movies" folder of course): WinKey+R -> cmd /c dir "d:\movies" /b/ogn/s > "%HomeDrive%%HomePath%\Desktop\Adult_List.txt"

    This will place "Adult_List.txt" on your desktop, so please attach that to a reply.
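    For anyone not on a Windows command prompt, a rough Python equivalent of that listing is sketched below. It is an approximation only: the function names are made up for this example, and the sort is a plain case-insensitive sort on the full path, which will not always match the exact order that dir /ogn produces.

```python
from pathlib import Path
from typing import Iterable


def format_listing(paths: Iterable[str]) -> str:
    """Return one path per line, sorted by name without regard to case."""
    ordered = sorted(paths, key=lambda p: p.lower())
    return "".join(line + "\n" for line in ordered)


def write_collection_list(root: Path, output: Path) -> int:
    """Write every folder and file found under root to output.

    Returns the number of entries written.
    """
    paths = [str(p) for p in root.rglob("*")]
    output.write_text(format_listing(paths), encoding="utf-8")
    return len(paths)
```

    For example, write_collection_list(Path(r"d:\movies"), Path.home() / "Desktop" / "Adult_List.txt") would write the list to the desktop, assuming a Desktop folder exists in your home directory.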

    Based on the results I'm able to verify with your filenames, I might decide to go with an idea I had before and that is to use Google.com combined with the "site:cduniverse.com" option. Normally I try to avoid Google for certain searches, because they know enough about me already, but they do yield fantastic search results.

    As for your other adult site suggestion: I was unable to find a positive match on a lot of movies in my own collection. It's weird, because in my case I got the 9th sequel to a certain title and even though they list the original and 5 sequels, they lack info on the most recent 3 movies. It seems CDUniverse has the most complete collection, so it makes more sense to use that one. If you can support your reasoning more I might be persuaded, but I think focussing on fixing the CDUniverse.com scraper would yield the best results.

    And thank you for the support. I've never been good at Regular Expressions, and always relied on others when I ran into them. So once I saw how much these scrapers rely on them, combined with mascot's request, and the fact that it would be nice on my own collection, I decided to give it a go. At least this way my learning benefits a few people who might find this useful, but I knew ahead of time that not many would want to admit to using it :D
     
  7. xcfalcon351


    Thanks for your quick reply to my queries

    The reason I was wondering which URL you were using is that I tried manually entering the large front and back URLs so I could perhaps sort my collection that way, however the program didn't seem to want to capture the .jpg's from the page. I'm probably doing something wrong at my end, so I will keep playing with that one.

    My auto-approval ratio was more like 1:10, with well over half not coming up with the correct title in the drop-down box. I went through and manually checked the titles I have with CDUniverse and "most" of them were listed. Is it possible to increase the number of options in the drop-down box? Maybe that's a question for the Moving Pictures guys...

    I've attached my embarrassingly long file and directory structure for the movies I'm trying to sort. This is the complete list, so some of these were found OK and some weren't. Hopefully it's of some use to you in further development.

    I took another look at that site I suggested, and while it seemed better initially, once I put my whole collection through it I can see what you mean. CDUniverse does seem to be a more complete listing :oops:

    Keep up the good work and thanks again!
     

    Attached Files:

    • AdultList.txt
      File size:
      30.6 KB
      Uploaded:
      December 29, 2009
      Views:
      2,388
  8. RoChess

    Ok, it seems the main issue is that even though your folder names can be matched to the right title, the actual filename inside the folder is often a cryptic one. It is this filename that my scraper ends up using, so I have to figure out if I can get the folder name from MovingPictures and use that when a search on the filename yields no usable results.
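    To make the idea concrete, here is a small Python sketch of that fallback order. It only illustrates the approach and is not how the scraper script does it (the scraper is an .xml script, and whether the folder name is available to it is exactly the open question). The function name is hypothetical.

```python
from pathlib import PureWindowsPath
from typing import List


def candidate_search_strings(path: str) -> List[str]:
    """Return search strings to try in order: filename first, then folder name."""
    p = PureWindowsPath(path)
    candidates = [p.stem]  # filename without its extension
    folder = p.parent.name  # empty when the file sits directly in a drive root
    # Only add the folder name if it offers something different from the filename.
    if folder and folder.lower() != p.stem.lower():
        candidates.append(folder)
    return candidates
```

    With a path such as d:\movies\Some Title\st-x264.avi this gives the cryptic filename first and "Some Title" as the second attempt.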

    Re-examining the CDUniverse HTML output, I did notice a way in which I should be able to improve the auto-approval results.

    So I'll try to figure out if I can obtain the folder name to use for the scraper search, but if I'm unable to do so, you'll have to resort to renaming all your movie files. Multi-part files are no problem; those are dealt with before they are passed on to the scraper. So "Some Title CD1.avi" and "Some Title CD2.avi" result in a single "Some Title" search string with both files linked as parts of the same movie.

    The multi-part keywords are: CD#, DVD#, DISC#, DISK#, PART# or "(# of #)", where # is a letter or number.
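    As a rough illustration of how such markers can be stripped from a filename, here is a Python regular expression sketch. It is not the actual Moving Pictures code. The separator handling, the rule that a letter needs a separator before it (so that a title ending in "Discs" is left alone), and the function name are all assumptions of this example.

```python
import re
from typing import Optional, Tuple

# Matches a trailing multi-part marker: CD#, DVD#, DISC#, DISK#, PART# or "(# of #)".
MULTIPART = re.compile(
    r"""
    [\s._-]*                                   # optional separator before the marker
    (?:
        (?<![0-9a-z])                          # keyword must not be the tail of a word
        (?:cd|dvd|disc|disk|part)
        (?:[\s._-]*(?P<number>\d+)             # a number may follow directly
          |[\s._-]+(?P<letter>[a-z]))          # a letter needs a separator (assumption)
      |
        \(\s*(?P<index>\d+|[a-z])\s*of\s*(?:\d+|[a-z])\s*\)
    )
    \s*$
    """,
    re.IGNORECASE | re.VERBOSE,
)


def split_multipart(stem: str) -> Tuple[str, Optional[str]]:
    """Split a filename (without extension) into (search title, part id or None)."""
    match = MULTIPART.search(stem)
    if match is None:
        return stem, None
    part = match.group("number") or match.group("letter") or match.group("index")
    return stem[: match.start()].strip(), part.lower()
```

    With this sketch, "Some Title CD1" and "Some Title CD2" both reduce to the search title "Some Title", with part ids "1" and "2".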

    But wait on renaming all your files, until I know for sure it is a lost cause to obtain the foldername inside the scraper script.

    I'll keep you posted.
     
  9. xcfalcon351


    I thought that might have been an issue myself and was going to rename all my files, however quite often the scraper was unable to find a particular title even if I cut and pasted it directly from the cduniverse site into the custom search string section, so I haven't bothered as yet. Is there any way to get the scraper to look for the individual PID number for each title? I.e. so that if worst came to worst you could manually enter this number into Moving Pictures in the same way you do with IMDb numbers?
     
  10. RoChess

    Since you do not mind renaming all your files, I will release the v1.0.2 update that will yield better search results.

    As for the PID, I am sure that is possible, however I'm not that well-versed in this scraper stuff yet. I pretty much copied and pasted this one together, so that will have to wait :D I have a rough idea on how to go about it, so I might be able to figure it out.

    I'm actually pretty confident that you won't even need the PID option with v1.0.2, but I haven't been able to test it yet. The wife is hogging the HTPC :mad:
     
  11. xcfalcon351


    Sounds good to me :)
    I'll start renaming files ASAP so I can give it a go once you have it ready.

    I know the feeling, mine is doing exactly the same thing. Luckily all my "adult" movies are on my desktop at the other end of the house, so I'm quite happy if she stays where she is.

    Look forward to hearing from you once 1.0.2 is up.

    Cheers
     