SQLiteDatabase Plugin for MP2

breese

Retired Team Member
  • Premium Supporter
  • July 11, 2011
    Arlington Heights, Illinois
    United States of America
    Sorry, yes. All my music is on f:\Music\

    So I was looking at a few things and managed to find:
    1st scan while installed on C: DB file 3,536,668 KB
    2nd scan while installed on F: DB file 3,221,188 KB

    So I exported the info for every song in each album and placed the info into folders with artist names.
    I also looked up a very large album I have called 500 Classic Rock Songs (yep, all in 1 folder).
    I included the export of that album, and looking through the logs I found it took 1 hr 38 minutes to scan.
    Every album in my collection was scanned with Tag&Rename and has a cover jpg named folder.jpg.

    If there is anything else you want, please let me know...
     

    Attachments

    • Music.zip
      55.6 KB

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    ok, thanks again, breese!
    If all your music is on f:\Music\... then you most likely chose the "wrong" Resource Provider when you added your share. I suppose you took the "NetworkNeighborhoodResourceProvider" - which also works, as you see, but is presumably slower than the "LocalFileSystemResourceProvider".
    If you can't get enough of imports (as I currently do :D ), try again with that one. It should be much faster...
     

    breese

    Retired Team Member
  • Premium Supporter
  • July 11, 2011
    Arlington Heights, Illinois
    United States of America
    I got all the info I needed, backed up everything as Test2, and uninstalled.
    Just did a new install to C:. But I decided to kill two birds with one stone:
    Stopped the server and edited Paths.xml to point the SQLite DB to the R:\MediaPortal folder.
    Restarted the server, then configured the server (through the client) for music on local F:\Music.
    Server and client instantly matched up and the scan started with the DB files on R: as I wanted...

    Good news:
    The variations in the Paths.xml work great.
    Sending the DB to my RamDrive R: is G.D. fast!!! That 500 Classic Rock Songs folder was completed in 20 seconds!!
    That is not a joke!! I/O makes a huge difference on the scans.
    More to come!
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    :D That's hardcore testing then - the only remaining question is: What do you do with your Ram-Drive when you restart the system? Dump to disk? But this won't help you in case of a power loss...
     

    breese

    Retired Team Member
  • Premium Supporter
  • July 11, 2011
    Arlington Heights, Illinois
    United States of America
    I am only using the RamDrive for testing, to get an idea of the processes and drive I/O.
    So far the Raid5 F: is very fast with reading everything...
    Of course the DB being on R: is real fast!!
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    Some more test results:

    In the following I used a regular path and file name for the database file name again (instead of a URI). This means that Shared Cache mode is switched off. Also, SQLiteTransactions are always disposed. And of course we use our own ConnectionPool.
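
    (For illustration, the difference in SQLite's own terms - a minimal sketch with a hypothetical file name, not necessarily the exact strings the plugin uses: a plain file name gives each connection its own private page cache, while a URI file name with the cache parameter enables Shared Cache.)
    Code:
    -- plain file name: Shared Cache off, private page cache per connection
    C:\MP2-Server\Database\Example.s3db

    -- URI file name with the cache parameter: Shared Cache on
    file:///C:/MP2-Server/Database/Example.s3db?cache=shared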

    Page Size = 4KB / Autocheckpoint every 8192 pages
    Import Test: 22:08
    Page Size = 4KB / Autocheckpoint every 16384 pages
    Import Test: 21:30
    Page Size = 4KB / Autocheckpoint every 32768 pages
    Import Test: 20:59

    Page Size = 8KB / Autocheckpoint every 8192 pages
    Import Test: 21:41
    Page Size = 8KB / Autocheckpoint every 16384 pages
    Import Test: 21:22
    Page Size = 8KB / Autocheckpoint every 32768 pages
    Import Test: 20:55

    Page Size = 16KB / Autocheckpoint every 8192 pages
    Import Test: 22:40
    Page Size = 16KB / Autocheckpoint every 16384 pages
    Import Test: 21:51

    So for 4KB page size, Autocheckpoint=32768 seems to be the best (i.e. max wal-file size is 128MB)
    For 8KB page size, we are about at the same speed for Autocheckpoint=32768 - which means however, that the maximal size of the wal file would be 256MB. I would consider this a bit too much and would therefore prefer to go with Autocheckpoint=16384 - assuming that 8KB page size has any advantage for read speed.
    16KB page size is in general less ideal than 4KB and 8KB regarding write speed. So if we don't find it much faster for reads, this one is out of the game...
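
    (For reference, a minimal sketch of how these parameters map to standard SQLite pragmas, using the best write-speed combination from above; the plugin applies its values through its own settings, so the exact statements it issues may differ:)
    Code:
    -- page size must be set before the database content is created
    PRAGMA page_size = 4096;

    -- write-ahead logging with a checkpoint every 32768 pages
    -- (32768 pages * 4 KB/page = 128 MB maximum wal file size)
    PRAGMA journal_mode = WAL;
    PRAGMA wal_autocheckpoint = 32768;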

    For the read tests, I will test 4KB, 8KB and also 16KB page size - the autocheckpoint value has no impact on pure read speed so I will use the respective best value determined for each of the options above.

    But now I'm facing a problem with my read tests: I have trouble getting repeatable and reliable results, and I suppose this is caused by the Windows OS read cache. Sometimes the first read is really slow, but sometimes the first read is nearly as fast as the second. So I suppose that in the latter case Windows has already read the database file into memory.
    I'm not yet sure how to deal with this. Maybe use some utility (don't remember the name right now - started with "RAM...") to clear the Windows OS cache before starting up the server. We will see...

    Finally, I figured out how to hook into SQLite to get its internal log messages. I will probably also introduce some optional trace logging. Let's see what information it can give us...
     

    breese

    Retired Team Member
  • Premium Supporter
  • July 11, 2011
    Arlington Heights, Illinois
    United States of America
    My current scan is a standard install, but the DB file was configured to save to R:\MediaPortal (RamDrive).
    Note: As fast as the RamDrive is, it is on schedule for a 6 and a half hr scan/build, as I have seen with a standard SSD install.

    What is to be noted: Each time the wal process clears out, I see a number (ok.. a lot) of wal processes at start-up in Resource Monitor. As the wal gets up to the 4,209 KB range, Resource Monitor drops to 3 or 4 wal entries showing in the monitor.
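
    (For context, the wal file "clearing out" is SQLite performing a checkpoint; if the default autocheckpoint of 1000 pages and a 4 KB page size are in effect, that happens at roughly 4 MB of wal, which would match the size seen above. A checkpoint can also be requested manually - a minimal sketch of the standard SQLite statement, not something MP2 necessarily issues itself:)
    Code:
    -- copy the wal content back into the main database file
    -- and truncate the wal file (TRUNCATE is the most aggressive mode)
    PRAGMA wal_checkpoint(TRUNCATE);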

    At the pace it is going, it appears the scan times are read-related, not write-related, because the RamDrive is much faster and has less I/O than any physical drive...

    When the scan completes I will post more info... But it is getting late here in the States and I might not have results till the AM...
     

    Attachments

    • Meny Wal.jpg
      543.4 KB

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    Oh and one more thing:
    When I disable our SQLiteDebug.log (which gets quite large during an import), I get:

    Page Size = 4KB / Autocheckpoint every 32768 pages / no SQLDebug.log
    Import Test: 20:13

    That saves us another 42 seconds on the import, but I have to admit that I thought the difference would be bigger, because it writes a > 200 MB log file during the import. Our logger seems to be quite efficient. I will switch it on again for further tests to stay consistent...
     

    breese

    Retired Team Member
  • Premium Supporter
  • July 11, 2011
    Arlington Heights, Illinois
    United States of America
    With MP2 saving my DB files to my RamDrive, it took almost 13 hrs to complete.

    So, here is an interesting fact.
    1st scan: Installed on SSD C: - 6 and a half hrs: DB file 3,536,668 KB
    2nd scan: Installed on Raid5 F: - over 24 hrs: DB file 3,221,188 KB
    3rd scan: Installed on SSD C:, configured to save DB to RamDrive R: - 14 hrs: DB file 3,218,692 KB

    While scans 2 and 3 are closer in size than 1, there appears to be a discrepancy in size.
    No files were changed in my Music folder during any of these tests.

    Idea:
    Why not import only what is needed on an initial scan, like:
    file name
    directory

    Hold off on importing everything else until a user opens the album, then compare the info to the DB and update it.
    This speeds up the initial scan, keeps the database to a minimum, and allows the system to auto-update as the user opens / accesses an album.

    Thoughts?
    btw: the logs I uploaded are complete, and the Server log does include the SQLDebug.
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    Hi everyone,

    breese has given me so much information that I can hardly cope with it :D Thanks for that - very interesting results.
    But before I get into a deep analysis of this (real life is quite stressful at the moment), I wanted to report some more things I noticed while playing with SQLite.

    First of all, I have implemented a VERY verbose additional logging option. It hooks into the System.Data.SQLite logging facility and creates a new SQLiteDebug.log file. I started an import while I was logging and stopped it 2 minutes later because the log file was already > 250 MB. So this is really just for debugging purposes. I therefore added a new setting (EnableTraceLogging) which is off by default.

    The logger reports all errors that may come directly from SQLite (i.e. the C native library) as well as all information logged by System.Data.SQLite (i.e. the C# wrapper). This is interesting because you not only see the commands sent by MP2, but also the commands that are transparently generated by System.Data.SQLite (e.g. if you dispose a SQLiteTransaction object, you can see that this triggers a "ROLLBACK" to the database; or instantiating a new SQLiteTransaction object issues a "BEGIN IMMEDIATE" to the database because we use IsolationLevel.Serializable, etc.). So this is very interesting information and I even plan to make it more verbose.

    In SQLite you can prefix every "regular" SQL command with "EXPLAIN QUERY PLAN ...", which does not actually execute the following query, but just tells you what it would do exactly if your query was executed. This information is even more interesting (see below) and I will try to get that into the SQLiteDebug.log as well.

    What I did with that information so far is the following: As you can read some pages before in this thread, our ReadTest 2 (i.e. just entering the audio section in the MP2 Client) has a very long QueryTime. Actually, the QueryTime is a big part of the whole ReadTime, which is not the case for the other queries (there, reading takes much more time than the query itself). I wanted to find out why that's the case.

    As a recap, the SQL query which is executed when you enter the audio section is the following:
    Code:
    SELECT
      COUNT(V.C) C,
      V.A0
    FROM (
      SELECT DISTINCT
        T1.MEDIA_ITEM_ID C,
        T0.ALBUM A0
      FROM M_PROVIDERRESOURCE T1
        INNER JOIN M_MEDIAITEM T2 ON T2.MEDIA_ITEM_ID = T1.MEDIA_ITEM_ID
        INNER JOIN M_AUDIOITEM T0 ON T0.MEDIA_ITEM_ID = T1.MEDIA_ITEM_ID
      WHERE (
        T1.SYSTEM_ID = @V0
        OR T1.SYSTEM_ID = @V1
      )
    ) V
    GROUP BY
      V.A0
    -------------------------------------------------------
    "V0" [String]: '62133952-2a3d-4e55-8b9c-cc3fc52b4d14'
    "V1" [String]: 'cbe4e0ac-a21d-4d18-be63-7bf6b27cb13a'
    (actually this exact query is executed twice - which I already discovered as a likely bug, still have to put this into Jira so that it is not forgotten...)

    Now I put "EXPLAIN QUERY PLAN" in front of this query and got the following result:
    Code:
    1|0|0|SEARCH TABLE M_PROVIDERRESOURCE AS T1 USING INDEX IDX_0 (SYSTEM_ID=?)
    1|0|0|EXECUTE LIST SUBQUERY 2
    1|1|1|SEARCH TABLE M_MEDIAITEM AS T2 USING COVERING INDEX IDX_6 (MEDIA_ITEM_ID=?)
    1|2|2|SEARCH TABLE M_AUDIOITEM AS T0 USING INDEX IDX_25 (MEDIA_ITEM_ID=?)
    1|0|0|USE TEMP B-TREE FOR DISTINCT
    0|0|0|SCAN SUBQUERY 1 AS V
    0|0|0|USE TEMP B-TREE FOR GROUP BY
    Those of you who are interested enough can read what that means in detail here.

    The important part now is "USE TEMP B-TREE...". This means nothing less than that SQLite creates a temporary index to perform the "DISTINCT" operation and an additional temporary index to perform the "GROUP BY" operation. This not only explains why breese is seeing a lot of temporary files (these temporary indices are actually written to temporary files by SQLite), it also explains why this query takes so long, because creating a temporary index is extremely expensive.

    But why does it create those indices? To be honest, I don't really know...

    First of all, the "DISTINCT" keyword in the context of this query doesn't make sense in my opinion. We want all the distinct rows from "T1.MEDIA_ITEM_ID C". But I hope there are not multiple songs WITH THE SAME MEDIA_ITEM_ID in one album. The MEDIA_ITEM_ID should be unique, so to my understanding the results of this query are already unique without using DISTINCT. If we leave out the "DISTINCT" keyword, we should get exactly the same query result, but the query plan is reduced by "1|0|0|USE TEMP B-TREE FOR DISTINCT".
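
    (For illustration, this is simply the query from above with the "DISTINCT" keyword removed - same result set under the assumption that MEDIA_ITEM_ID is unique:)
    Code:
    SELECT
      COUNT(V.C) C,
      V.A0
    FROM (
      SELECT
        T1.MEDIA_ITEM_ID C,
        T0.ALBUM A0
      FROM M_PROVIDERRESOURCE T1
        INNER JOIN M_MEDIAITEM T2 ON T2.MEDIA_ITEM_ID = T1.MEDIA_ITEM_ID
        INNER JOIN M_AUDIOITEM T0 ON T0.MEDIA_ITEM_ID = T1.MEDIA_ITEM_ID
      WHERE (
        T1.SYSTEM_ID = @V0
        OR T1.SYSTEM_ID = @V1
      )
    ) V
    GROUP BY
      V.A0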

    Secondly, I don't know why it uses a temp index for GROUP BY V.A0. V.A0 is just our album name - and there is already an index created by MP2 for the album name. Most likely, SQLite doesn't recognize this because of the subquery. Maybe this changes if I perform an "ANALYZE" on the database, which is said to analyze the database structure and thereby improve SQLite's query optimizer. We will see...
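
    (For reference, a minimal sketch of that check - ANALYZE collects table and index statistics that SQLite's query planner can use; re-running the "EXPLAIN QUERY PLAN" from above afterwards would show whether the temporary B-tree for GROUP BY is gone:)
    Code:
    -- collect statistics on all tables and indices for the query planner
    ANALYZE;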

    Now what I want to say is: This is again outside the scope of the SQLiteDatabase plugin. But it shows that there are still a lot of performance improvement possibilities in the MediaLibrary system of MP2 itself. Let's see if I have a look at this after SQLiteDatabase v0.08 is ready...
     
