home
products
contribute
download
documentation
forum
Home
Forums
New posts
Search forums
What's new
New posts
All posts
Latest activity
Members
Registered members
Current visitors
Donate
Log in
Register
What's new
Search
Search
Search titles only
By:
New posts
Search forums
Search titles only
By:
Menu
Log in
Register
Navigation
Install the app
Install
More options
Contact us
Close Menu
Forums
MediaPortal 2
Plugin Development
SQLiteDatabase Plugin for MP2
Contact us
RSS
JavaScript is disabled. For a better experience, please enable JavaScript in your browser before proceeding.
You are using an out of date browser. It may not display this or other websites correctly.
You should upgrade or use an
alternative browser
.
Reply to thread
Message
<blockquote data-quote="MJGraf" data-source="post: 1029050" data-attributes="member: 17886"><p>Did I ever say that I don't see much room for improvement <img src="" class="smilie smilie--sprite smilie--sprite7" alt=":p" title="Stick Out Tongue :p" loading="lazy" data-shortname=":p" /> ?</p><p></p><p>The issue I apparently had (and certainly still have to a large extent) is that I never worked with SQLite or System.Data.SQLite before and to really tune a database engine for performance you really have to know a lot about the internals of that database. While SQLite itself (the C-Library) is fairly well documented, the documentation of System.Data.SQLite is in my opinion very small. So the only way to find out how things work is to read the source code and I have already found out some very interesting things. But let me start from the beginning:</p><p></p><p>Very early in this thread I shortly explained why we switched to WAL-mode ("write ahead logging"): Threads writing to the database do not block reading threads in WAL-mode, which gives us a noticeable speed improvement in particular when a MP2 Client is in use (i.e. reading the database) while there is an import running (i.e. writing to the database).</p><p>Technically (and very simplified) this works as follows: Writers do not write to the database file itself, they write to a WAL-file. Readers therefore first check whether the data they query is contained in the WAL-file (i.e. it has been changed by a writer before). If so, they read from the WAL file (which does not affect writers because they simply append every new write transactions to the end of the WAL file). If they don't find the queried data in the WAL file, they read from the database file itself.</p><p>This implies that at some point in time the content from the Wal-file has to be transferred to the database file. This is called a "Check-Point". Such a Check-Point may (depending on how it is performed, but this is quite complicated so let's just assume it is like that) block readers and they in particular block the thread in which the Check-Point is performed. </p><p>Now in SQLite standard settings, there is a thing called "Auto-Checkpoint". This means that after a certain number of pages (standard: 1.000) is written to the WAL-file, the next write-commit automatically triggers a Check-Point operation. We use a page-size of 4KB, which means such check-point is executed about every 4MB written to the database. Now my database is 1.7GB in size, which means there are more than 400 Check-Points during my initial import. You can watch that if you use the latest SQLiteDatabase plugin version and have a look into the database directory while an import is running. There is a file ending with "wal", which appears, grows until about 4MB, disappears (= checkpoint) and then reappears again all the time.</p><p>So my idea was that during an import I could switch off the auto-checkpoint feature and switch it back on, when the import is finished. That would result in saving the time of more than 400 checkpoints and only have one checkpoint, which may take a little longer, but my hope was that it doesn't take as long as the 400 short checkpoints.</p><p>But what I found out was something completely different.... I switched off the auto-checkpoint feature in my code, started a new import and watched the database directory and guess what - the wal-file appeared, grew until about 4MB, disappeared and showed up again all the time. What I was expecting is a wal-file, which constantly grows to about 1.7GB and in the end is transferred to the database file and only then disappears - but this was not the case. So apparently there were still regular checkpoints happening - but I switched them off!!?!?</p><p></p><p>After a long research in the internet I found out that there is one point in time when a checkpoint happens in any case - independent from the autocheckpoint setting: When the last connection to the database is closed. Then I thought that this is impossible, because I switched on the ConnectionPool feature in System.Data.SQLite and if we do so, the expected behavior is that when you call "Close();" on a database connection it is not actually closed, but it stays open and is returned to the ConnectionPool to be reused later. But apparently, this was not (really) the case.</p><p>So I digged quite deep into the System.Data.SQLite code and found that they implemented the ConnectionPool with weak references! That means that during every garbage collection cycle all the databases, which are currently in the pool, are disposed (and therefore closed). This really shocked me, because all the tests I did with SQLite settings were pretty much useless.</p><p>As an example: Every connection maintains its own cache. When the connection is closed, the cache is deleted. So my tests on what cache size is optimal, did not make sense since most likely every connection was only used a few times (depending on how often the GC runs) and then closed. In such a scenario the cache-size does of course not really matter...</p><p></p><p>So what we need here in the first place is our own implementation of a connection pool with strong references. Funnily, there is already an implementation in our FireBirdDatabase directory, which is however not used in the current implementation of the FireBird database. There is also a comment in the FireBirdDatabase code saying something like "we don't need the connection pool in the database connection string, we use our own implementation". @[USER=67886]Albert[/USER] are you still around? Is it possible that you found out the same about the FireBirdDatabase and therefore began to implement our own ConnectionPool?</p><p></p><p>Anyway - I will implement our own ConnectionPool, but to make things more comparable I will first write down a final test protocol, because we have to change the one above due to the findings in the meantime. In particular we have to differentiated between "query-time" (i.e. the time it takes until a query finishes as per the SQLDebug.Log) and the "read-time" (i.e. the time until the data from the query is actually read from the database and put into MediaItem objects - this value we can take from my modification to the MediaLibrary on the last page of this thread, which logs this time to our regular server.log).</p><p>Once the test protocoll is ready, I will run the whole test for the current v0.07 version of this plugin (to be published shortly after Alpha3) and write down the results here compacted in one thread.</p><p>After that I will start to implement further improvements (in particular the ConnectionPool) and test them according to this test protocol to make sure the figures are 100% comparable to our clean v0.07.</p><p></p><p>So stay tuned.... <img src="" class="smilie smilie--sprite smilie--sprite2" alt=";)" title="Wink ;)" loading="lazy" data-shortname=";)" /></p></blockquote><p></p>
[QUOTE="MJGraf, post: 1029050, member: 17886"] Did I ever say that I don't see much room for improvement :p ? The issue I apparently had (and certainly still have to a large extent) is that I never worked with SQLite or System.Data.SQLite before and to really tune a database engine for performance you really have to know a lot about the internals of that database. While SQLite itself (the C-Library) is fairly well documented, the documentation of System.Data.SQLite is in my opinion very small. So the only way to find out how things work is to read the source code and I have already found out some very interesting things. But let me start from the beginning: Very early in this thread I shortly explained why we switched to WAL-mode ("write ahead logging"): Threads writing to the database do not block reading threads in WAL-mode, which gives us a noticeable speed improvement in particular when a MP2 Client is in use (i.e. reading the database) while there is an import running (i.e. writing to the database). Technically (and very simplified) this works as follows: Writers do not write to the database file itself, they write to a WAL-file. Readers therefore first check whether the data they query is contained in the WAL-file (i.e. it has been changed by a writer before). If so, they read from the WAL file (which does not affect writers because they simply append every new write transactions to the end of the WAL file). If they don't find the queried data in the WAL file, they read from the database file itself. This implies that at some point in time the content from the Wal-file has to be transferred to the database file. This is called a "Check-Point". Such a Check-Point may (depending on how it is performed, but this is quite complicated so let's just assume it is like that) block readers and they in particular block the thread in which the Check-Point is performed. Now in SQLite standard settings, there is a thing called "Auto-Checkpoint". This means that after a certain number of pages (standard: 1.000) is written to the WAL-file, the next write-commit automatically triggers a Check-Point operation. We use a page-size of 4KB, which means such check-point is executed about every 4MB written to the database. Now my database is 1.7GB in size, which means there are more than 400 Check-Points during my initial import. You can watch that if you use the latest SQLiteDatabase plugin version and have a look into the database directory while an import is running. There is a file ending with "wal", which appears, grows until about 4MB, disappears (= checkpoint) and then reappears again all the time. So my idea was that during an import I could switch off the auto-checkpoint feature and switch it back on, when the import is finished. That would result in saving the time of more than 400 checkpoints and only have one checkpoint, which may take a little longer, but my hope was that it doesn't take as long as the 400 short checkpoints. But what I found out was something completely different.... I switched off the auto-checkpoint feature in my code, started a new import and watched the database directory and guess what - the wal-file appeared, grew until about 4MB, disappeared and showed up again all the time. What I was expecting is a wal-file, which constantly grows to about 1.7GB and in the end is transferred to the database file and only then disappears - but this was not the case. So apparently there were still regular checkpoints happening - but I switched them off!!?!? After a long research in the internet I found out that there is one point in time when a checkpoint happens in any case - independent from the autocheckpoint setting: When the last connection to the database is closed. Then I thought that this is impossible, because I switched on the ConnectionPool feature in System.Data.SQLite and if we do so, the expected behavior is that when you call "Close();" on a database connection it is not actually closed, but it stays open and is returned to the ConnectionPool to be reused later. But apparently, this was not (really) the case. So I digged quite deep into the System.Data.SQLite code and found that they implemented the ConnectionPool with weak references! That means that during every garbage collection cycle all the databases, which are currently in the pool, are disposed (and therefore closed). This really shocked me, because all the tests I did with SQLite settings were pretty much useless. As an example: Every connection maintains its own cache. When the connection is closed, the cache is deleted. So my tests on what cache size is optimal, did not make sense since most likely every connection was only used a few times (depending on how often the GC runs) and then closed. In such a scenario the cache-size does of course not really matter... So what we need here in the first place is our own implementation of a connection pool with strong references. Funnily, there is already an implementation in our FireBirdDatabase directory, which is however not used in the current implementation of the FireBird database. There is also a comment in the FireBirdDatabase code saying something like "we don't need the connection pool in the database connection string, we use our own implementation". @[USER=67886]Albert[/USER] are you still around? Is it possible that you found out the same about the FireBirdDatabase and therefore began to implement our own ConnectionPool? Anyway - I will implement our own ConnectionPool, but to make things more comparable I will first write down a final test protocol, because we have to change the one above due to the findings in the meantime. In particular we have to differentiated between "query-time" (i.e. the time it takes until a query finishes as per the SQLDebug.Log) and the "read-time" (i.e. the time until the data from the query is actually read from the database and put into MediaItem objects - this value we can take from my modification to the MediaLibrary on the last page of this thread, which logs this time to our regular server.log). Once the test protocoll is ready, I will run the whole test for the current v0.07 version of this plugin (to be published shortly after Alpha3) and write down the results here compacted in one thread. After that I will start to implement further improvements (in particular the ConnectionPool) and test them according to this test protocol to make sure the figures are 100% comparable to our clean v0.07. So stay tuned.... ;) [/QUOTE]
Insert quotes…
Verification
Post reply
Forums
MediaPortal 2
Plugin Development
SQLiteDatabase Plugin for MP2
Contact us
RSS
Top
Bottom