SQLiteDatabase Plugin for MP2 (1 Viewer)

MJGraf

Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    Oh, and just for the fun of it:
    SQlite is able to create pure in-memory databases. That's basically the same as when you, breese, put your database on a RAM drive - just without the overhead of the RAM drive software and the windows file system. I just did an import run with a pure in-memory database and after about 3/4 of the import I got an out-of-memory.exception :D
    The interesting part is: When I take the time until the exception was thrown and calculate how long it would have taken until the import would have been finished, I end up with something like 16-17 minutes for my ImportTest. Compared to the 20 minutes with my SSD, this is quite a long time - so my suspicion is that the database is already quite effective. Most of the time is probably spent by reading the tags from the files...

    But I have one further idea: Does anyone know whether our ThumbnailSmall and ThumbnailLarge aspects hold compressed image data (e.g. the bytes directly from the compressed jpg file) or is this uncompressed image data? In the latter case, it makes sense that we compress the byte[] data before we store it in the database - for performance and for storage space reasons...

    And breese: thanks again for all your help. I know I'm teasing all of you for quite some time now, but I don't want to have too many beta versions out there... As soon as the new system.data.sqlite version is out, I'll integrate it and publish a test version. I'm looking at the system.data.sqlite homepage everyday waiting for the new version to be published :D
     

    breese

    Retired Team Member
  • Premium Supporter
  • July 11, 2011
    3,902
    770
    66
    Arlington Heights, Illinois
    Home Country
    United States of America United States of America
    Most of the time is probably spent by reading the tags from the files...

    It is interesting you say that..... My other system does something MP might want to think about.
    Durring the initial scan, it reads in the directory, album, and titles. With a large collection of media it works real well
    Then, the first time someone goes to that album, the system does a compare againest the database and fills in the missing or changed information. A normal album of 9 to 15 songs only takes about a second to complete.

    Why import everything / something until it is actually needed? If the basic is there, the inital scan is fast there-by returning contol over to the system to do other things.

    Oh, and as for Teasing me... I do understand..... :cool:
     

    morpheus_xx

    Retired Team Member
  • Team MediaPortal
  • March 24, 2007
    12,073
    7,459
    Home Country
    Germany Germany
    Does anyone know whether our ThumbnailSmall and ThumbnailLarge aspects hold compressed image data (e.g. the bytes directly from the compressed jpg file) or is this uncompressed image data? In the latter case, it makes sense that we compress the byte[] data before we store it in the database - for performance and for storage space reasons...
    The thumbnails are stored as compressed jpg, so no further optimization possibilities in this part.

    Why import everything / something until it is actually needed? If the basic is there, the inital scan is fast there-by returning contol over to the system to do other things.
    I thought about this also many times. In principle a first importer run could scan all known file extensions and add the path to the ML. Then a 2nd import could run all the known metadata extractors to add details.

    A problem is, that i.e. the AudioAspect is not filled after 1st pass, so a audio file won't show up inside "Audio Library". This could mean that the MDE would need to create an empty aspect first, then fills details in 2nd run.

    The changes to importer would be big (but not impossible), i.e. MDEs would need to get extended with a 1st run mode and all existing logic would need to be adopted to this way.
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    To give you a bit of an example of what problems I face with the ReadTests - most likely due to the windows operating system read cache - I have done some ReadTests now with the current status version of SQLiteDatabase. I've only measured the ReadTimes (not the QueryTimes).

    The first result I got was this:
    locking_mode=Exclusive / Shared Cache / Page Size 4KB
    ReadTimes:
    ReadTest 1: 13576ms, 11891ms
    ReadTest 2: 330ms, 208ms
    ReadTest 3: 27.7ms, 9.6ms
    ReadTest 4: 844.8ms, 61.5ms
    As you see, this is with shared cache enabled, exclusive locking mode and a page size of 4KB. My first thought was: Damn, now with exclusive locking mode the ImportTest, i.e. write performance, is faster, but the ReadTests are much slower.

    So I left everything as it was and just set the locking mode to normal. Restarted MP2 Server and these are the results:
    locking_mode=Normal / Shared Cache / Page Size 4KB
    ReadTimes:
    ReadTest 1: 12079ms, 11809ms
    ReadTest 2: 238ms, 208ms
    ReadTest 3: 15.4ms, 9.9ms
    ReadTest 4: 117.5ms, 42.1ms
    In particular the first pass of ReadTest 2 and ReadTest4 made me think that my suspicion was right - exclusive locking mode is slower than normal locking mode. But then again I couldn't believe this because everyone in various forums says, exclusive mode is faster. So again I didn't change anything but setting exclusive locking mode, restarted MP2 server and did exactly the same test again. Results as follows:

    locking_mode=Exclusive / Shared Cache / Page Size 4KB
    ReadTimes:
    ReadTest 1: 12110ms, 11736ms
    ReadTest 2: 243ms, 236ms
    ReadTest 3: 24.4ms, 12.8ms
    ReadTest 4: 117.5ms, 37.2
    Now all of a sudden, locking mode exclusive was about as fast as normal locking mode - maybe just a bit slower in ReadTest 3.

    This time I left everything as it was, locking mode still exclusive. Restarted MP2 Server and just did the test again:
    locking_mode=Exclusive / Shared Cache / Page Size 4KB
    ReadTimes:
    ReadTest 1: 12208ms, 11647ms
    ReadTest 2: 237ms, 218ms
    ReadTest 3: 15.7ms, 10.2ms
    ReadTest 4: 122.8ms, 45.7ms
    Now all the ReadTests - including ReadTest 3 - were about as fast as with normal locking mode.

    Once again no changes, a restart of MP2 Server and the same test again:
    locking_mode=Exclusive / Shared Cache / Page Size 4KB
    ReadTimes:
    ReadTest 1: 12191ms, 11761ms
    ReadTest 2: 259ms, 207ms
    ReadTest 3: 18.9ms, 9.8ms
    ReadTest 4: 114.7ms, 43.8ms
    Roughly the same results as with the last test...

    Now the conclusion of all this is that probably the windows OS read cache makes the test results somehow unpredictable. After a certain number of times that I repeat the same tests (with a restart of MP2 Server between the tests, so the SQLite inbuilt cache is definitely empty) it seems that we get more or less reliable results, but I don't know after what number of repeats - and so I would have to guess whether a result was caused by my optimizations or by the respective parts of the database file being in windows read cache.

    So while under normal using conditions there is of course also the windows OS cache, which makes MP2 Server faster. But that doesn't help for testing and improving SQLiteDatabase. As I already mentioned above, I saw a tool somewhere in the internet, which is able to clear the windows read cache. I will try to find that again and then see whether I get consistent results...
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    Ok, found it again.
    The original tool I meant was "RamMap" from Sysinternals. This one can do a lot more than just flushing the read cache (aka "stand by list") of windows, but it is a graphical tool without a command line interface.
    Someone has just put the code to flush the stand by list into a small command line tool here: http://www.mediafire.com/download/epo2qh9s1ic4q6f/EmptyStandbyList.zip
    Will try whether that helps making the results more consistent...
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    Yep, that does the trick... For now let's just to ReadTest 2.

    Stop MP2 Server, Flush Read Cache, Start MP2 Server (locking_mode=Exclusive / Shared Cache / Page Size 4KB), Start MP2 Client.
    ReadTest 2 (only ReadTime): 888ms, 204ms
    So the first read is MUCH slower now without windows read cache.

    Stop MP2 Server, Flush Read Cache, Start MP2 Server (locking_mode=Exclusive / Shared Cache / Page Size 4KB), Start MP2 Client.
    ReadTest 2 (only ReadTime): 858ms, 202ms
    Same test procedure, more or less the same result.

    Stop MP2 Server, DO NOT Flush Read Cache, Start MP2 Server (locking_mode=Exclusive / Shared Cache / Page Size 4KB), Start MP2 Client.
    ReadTest 2 (only ReadTime): 240ms, 182ms
    Just to prove I was right - same procedure but without flushing the read cache - only restarting MP2 Server - and the first read time is MUCH faster - caused by the windows read cache. Second pass is also faster (don't know why, because even with flushing the read cache, after executing the query for the first time, the file should be in windows read cache and the result set should be in SQLiteCache. Maybe because ReadTest 2 is automatically executed twice with only a few ms between the tests and this may be to fast for the windows read cache...)

    Stop MP2 Server, DO NOT Flush Read Cache, Start MP2 Server (locking_mode=Exclusive / Shared Cache / Page Size 4KB), Start MP2 Client.
    ReadTest 2 (only ReadTime): 263ms, 195ms
    Same test, more or less the same result. Second pass this time is closer to the test results with flushing windows read cache...

    Stop MP2 Server, Flush Read Cache, Start MP2 Server (locking_mode=Exclusive / Shared Cache / Page Size 4KB), Start MP2 Client.
    ReadTest 2 (only ReadTime): 854ms, 191ms
    A last time with flushing os cache between restart and we are back to the much slower results from above.

    And now another interesting test and result: From the last test, I left the MP2 Server running and flushed the windows os cache while the MP2 Server was running and then did the same test again:
    ReadTest 2 (only ReadTime): 212ms, 208ms
    That means that the SQLiteCache is actually working very good because when running the test for a second time while the server is running SQLite doesn't care about the windows os read cache being flushed. So it seems we have nearly all we need in the SQLite cache and don't go back to the disk/windows os cache very often.

    Once again: Leave the MP2 Server running, flush windows OS read cache, start MP2 Client:
    ReadTest 2 (only ReadTime): 213ms, 199ms
    Same as above, the flushing of OS read cache does not affect our SQLite speed.

    So what conclusion can we take from these results:
    • First of all, when we do not have any cache (i.e. restart MP2 Server to cleat SQLite cache AND flush OS cache), the results are 888ms, 858ms, 854ms. The average of this is 867ms with a deviation between +2.4% and -1.5%. I would consider this as identical results. The deviation is probably caused by other background activity. These results are consistent, but they are unrealistic. No one manually flushes its OS cache. These results are only realistic, if someone freshly restarts its MP2 Server and then enters the audio section in MP2 Client for the first time after the restart. For anything after that - be it just exiting and reentering the audio section or even after a restart of MP2 Server service, the real results will be much better.
    • When we can only rely on the Windows OS cache - but not the SQLite cache (i.e. after restarting the MP2 Server but without flushing the OS cache), we get 240ms, 263ms. The average is 251.5ms with a deviation of +/- 4.5%. Also quite consistent results - but only realistic after a restart of MP2 Server service.
    • When both caches kick in, we get 204ms, 202ms, 182ms, 195ms, 191ms, 208ms, 199ms. That's an average of 197,3ms with a deviation from +5.4% to -8.8%. These results are probably the most realistic ones - but we have the biggest deviation.
    The only thing I can propose here is the following: We could just ignore the respective first test result to make sure, that both caches are up and running. This gives us the most realistic results when using MP2 "normally". On the other hand we have to say that anything +/- about 10% is not an improvement but coincidence. I think this makes sense because I'm not looking for +3% speed improvements that would probably not be noticed by anyone. I'm looking for +20%, +30% and +50% speed improvements - if possible.

    So what I would do in the following for ReadTests is:
    • Start the MP2 Server
    • Leave the MP2 Server running all the time while I do the ReadTests.
    • Do every Read Test three times.
    • Discard the first result of each ReadTest and only show the second and third result.
    • For ReadTest2 (which is automatically executed twice), I would discard result 1 and 2 and display result 3 and 4.
    • I would probably leave out the QueryTime in the future because this is really taking a lot of time to get it together and I think at this stage it doesn't really bring much more valuable information.
    So if anyone out there has reasons for not doing so, please speak up loud now :D
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    So to have a starting point - these are the current results with the ReadTest protocol as explained above:

    locking_mode=Exclusive / Shared Cache / Page Size 4KB / Discarded the respective first result
    ReadTimes:
    ReadTest 1: 11810ms, 11966ms (23776ms)
    ReadTest 2: 230ms, 202ms (432ms)
    ReadTest 3: 6.1ms, 6.9ms (13.0ms)
    ReadTest 4: 46.7ms, 33.5ms (80.2ms)

    The fact that ReadTest 2 is now much quicker than in all the tests before is that in the previous test protocol I did a restart of MP2 Server between ReadTest 1 and ReadTest 2. ReadTest 1 is a VERY heavy query, which already brings a lot of data into the SQLite cache. ReadTest 2 can apparently take advantage of this data in the cache. When I restart MP2 Server between ReadTest 1 and ReadTest 2, the results of ReadTest 2 are back to about 10-12ms (when discarding the first test result). So this is explainable and reproducible and I'll take these results now as a new starting point.
     

    MJGraf

    Retired Team Member
  • Premium Supporter
  • January 13, 2006
    2,478
    1,385
    ok, one of my iprovement ideas was the following:
    Currently, when we read data from the database, we use the generic IDataReader's GetValue(int colIndex) method and we cast the resulting object to the type we expect. system.data.sqlite's SQLiteDataReader, however, has native methods for different data types, such as GetBytes, GetGuid, GetString, etc., which directly return the desired type.
    What I tried now is to use these native methods at least for byte[] values (in particular images, because they are huge in size) and Guid values (because we use them very often). The other values were for simplification purposes still read with GetValue. Result as follows:

    locking_mode=Exclusive / Shared Cache / Page Size 4KB / Discarded the respective first result / Use non-generic read methods of SQLiteDataReader for byte[] fields and Guids
    ReadTest 1: 11746ms, 12001ms (23747ms)
    ReadTest 2: 214ms, 219ms (433ms)
    ReadTest 3: 10.2ms, 5.9ms (16.1ms)
    ReadTest 4: 44.1ms, 44.6ms (88.7ms)

    The result is that according to the +/- 10% definition above, ReadTests 1, 2 and 4 are about the same speed, ReadTest 3 is even slower (although I would bet this is coincidence again when you look at the two results of the two test passes. The discarded first result of ReadTest 3 was 8.2ms by the way so no idea why the second pass was slower...)

    Anyway - this doesn't seem the right direction. Back to square one...
     

    Users who are viewing this thread

    Top Bottom