| Author |
Message |
|
|
Hi Paul, regarding these 2 topics:
http://www.jaikoz.net/jaikozforum/posts/list/1530.page
http://www.jaikoz.net/jaikozforum/posts/list/1357.page
Nowadays most programming languages include table controls (ListView) that have a VirtualMode. I ran a test in C# with 100 million objects with 2 columns each, and it is super fast when implemented with VirtualMode. I know you use Java and I'm sure there is an equivalent. I posted the C# code here,
in the last answer of this thread:
http://social.msdn.microsoft.com/Forums/en-US/csharplanguage/thread/b91ffdc2-12e0-4d8b-860a-f0506f058af0/
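On the Java side, the closest equivalent I know of is a Swing JTable backed by a custom TableModel, because the table only asks the model for the cells it has to paint. Here is a minimal sketch of that idea. The class name and the two columns are made up for illustration, and the values are computed from the row number instead of coming from a real database.

```java
import javax.swing.table.AbstractTableModel;

// Hypothetical model: nothing is stored per row, values are produced on demand.
class VirtualSongTableModel extends AbstractTableModel {
    private final int rowCount;

    VirtualSongTableModel(int rowCount) {
        this.rowCount = rowCount;
    }

    @Override
    public int getRowCount() {
        return rowCount;
    }

    @Override
    public int getColumnCount() {
        return 2;
    }

    @Override
    public String getColumnName(int column) {
        return column == 0 ? "Id" : "Title";
    }

    @Override
    public Object getValueAt(int row, int column) {
        // A real model would look the row up in a cache or in the database here.
        return column == 0 ? Integer.valueOf(row) : "Song " + row;
    }
}
```

It would be used like `new JTable(new VirtualSongTableModel(100_000_000))` inside a JScrollPane. I have not benchmarked this, and sorting through the table's own row sorter would still touch every row, so sorting is better pushed to the data source.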
Regards
W
|
 |
|
|
Great photos guys
Thanks for sharing
(I can't give 5 stars to Paul's one; the website shows an error when I try)
Take care
Wil
|
 |
|
|
Hi Paul, I was wondering what the status of this feature is. I believe you mentioned you would work on it.
Best Regards,
Thanks
Wil
|
 |
|
|
Cool - thanks Paul. This feature will be so helpful.
Side note: I run Jaikoz from my WHS (Windows Home Server) on my music library, so this is a small computer still running a 32-bit OS. I guess I should upgrade to WHS Vail (WHS 2).
|
 |
|
|
I have had Jaikoz for a while now, but the feature I miss the most, the one that would make me say 'Wow' and use it much more often, is being able to apply the AutoCorrect to a folder that holds any number of songs (10K, 20K, 100K, whatever the number).
Without this feature Jaikoz is almost not useful to me, because I need to spend so much time dividing the folders into Jaikoz-manageable chunks, and this is too painful. I did it 2 times in the past and now realize I haven't been using Jaikoz for a few months because it can't handle huge music libraries.
So please add this feature; I'm sure many will really appreciate it.
Technically, I can see a few ways of doing this: a) filter more at the database level (do not put so much information in memory), OR b) have Jaikoz handle the library in chunks, so basically you do 1 pass for each smaller chunk and then some passes just to merge the chunks that have been processed.
For the UI, you can also display the data in the table by chunk, for instance only the first 100 rows from the database, with a button to see the next (and previous) 100, etc. Of course, ordering by title should put the ordering into the SQL and redisplay the new first 100, etc. I hope this helps.
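To make the "100 at a time, with the ordering in the SQL" idea concrete, here is a small sketch of a helper that builds the query for one page. The table name `song`, the column names and the class name are assumptions of mine, not the real Jaikoz schema, and the LIMIT/OFFSET syntax is not supported by every database.

```java
import java.util.Set;

// Hypothetical helper: builds the SQL for one page of the table view.
final class SongPageQuery {
    // Only known columns may be used for sorting, so user input never reaches the SQL text.
    private static final Set<String> SORTABLE = Set.of("title", "artist", "album");

    private SongPageQuery() {
    }

    static String build(String orderBy, boolean ascending, int pageIndex, int pageSize) {
        if (!SORTABLE.contains(orderBy)) {
            throw new IllegalArgumentException("Unknown sort column: " + orderBy);
        }
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("Invalid page request");
        }
        long offset = (long) pageIndex * pageSize;
        // The trailing "id" keeps the order stable when two songs share the same sort value.
        return "SELECT id, title, artist, album FROM song ORDER BY " + orderBy
                + (ascending ? " ASC" : " DESC")
                + ", id LIMIT " + pageSize + " OFFSET " + offset;
    }
}
```

The "next" button would then just call `build("title", true, pageIndex + 1, 100)` and redisplay the result.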
Thanks a lot for your help
Wilhelm
|
 |
|
|
Exactly, according to the active filter and criteria (the simplest is to push this to the db).
The important thing is filling the down cache (if we go down) and the up cache (if we go up) in a background thread, so it feels instantaneous. For example, after the user has viewed a few records of the current page, the down cache should already be complete. You can also cache more than 1 page up and down if you want, like 5 pages total in memory: 2 up, 1 current, 2 down. A page can be 1000 or 100 rows; it really depends on performance.
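A rough sketch of this background filling, assuming the pages come from some loader function (one db query per page). All names here are made up. Note that `getPage` waits if the requested page is not loaded yet, so a real UI would not call it directly on the event thread.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

// Hypothetical cache: keeps the current page plus `radius` pages on each side.
class PrefetchingPageCache<T> implements AutoCloseable {
    private final IntFunction<List<T>> loader; // e.g. one database query per page
    private final int pageCount;
    private final int radius;
    private final Map<Integer, CompletableFuture<List<T>>> pages = new ConcurrentHashMap<>();
    private final ExecutorService background = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "page-prefetch");
        t.setDaemon(true);
        return t;
    });

    PrefetchingPageCache(IntFunction<List<T>> loader, int pageCount, int radius) {
        this.loader = loader;
        this.pageCount = pageCount;
        this.radius = radius;
    }

    List<T> getPage(int index) {
        if (index < 0 || index >= pageCount) {
            throw new IndexOutOfBoundsException("page " + index);
        }
        int from = Math.max(0, index - radius);
        int to = Math.min(pageCount - 1, index + radius);
        // Forget the pages that fell out of the window.
        pages.keySet().removeIf(p -> p < from || p > to);
        // Ask for the current page first so it is loaded before its neighbours.
        CompletableFuture<List<T>> current = request(index);
        for (int p = from; p <= to; p++) {
            request(p);
        }
        return current.join();
    }

    boolean isCached(int index) {
        return pages.containsKey(index);
    }

    private CompletableFuture<List<T>> request(int page) {
        return pages.computeIfAbsent(page,
                p -> CompletableFuture.supplyAsync(() -> loader.apply(p), background));
    }

    @Override
    public void close() {
        background.shutdownNow();
    }
}
```

With `radius` set to 2 this gives the "2 up, 1 current, 2 down" layout; the page size is decided by whatever the loader returns.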
|
 |
|
|
Another solution could be to have "Prev" and "Next" buttons under the table. You basically display 1000 rows (and cache 1000 up and 1000 down), so we can browse the current 1000. If we press Next, you display the next 1000, drop the up cache and replace it with the rows that were displayed, load the following 1000 into the down cache in the background, etc. I guess the difference is that the button triggers the event instead of the table slider.
Note: somewhere on the UI you will need to display which page of a thousand we are on. For example, if I'm looking at 2000 to 2999 out of 100000, it could display "Pages: 2/100" (otherwise it is more work, because you have to inherit from the table and fake the slider to make it look continuous).
|
 |
|
|
This is a common visualization issue for huge dbs with billions of records (DNA, biological info, or even financial data), where the dbs are shared by many users and are remote.
Sorting can be done for sure, and there are different ways (re-sort on a temp table with only the PKs (primary keys), etc.). The best is if you can find one already implemented.
When you approach the 1000th row there are at least 2 possible implementations. One is to just slide the 3 orange cache rectangles through the db (so basically start filling the down cache once you arrive at, let's say, 1000 minus 10%). The other is, as soon as you are at the 1000th row, to drop the up cache: the current view becomes the up cache, the down cache becomes the current view, and you load the next 1000 from the db in the background to fill the new down cache. This can go really fast as the db is local. Some tests will be needed to find the best values, like 1000 versus 100 versus 500.
You need some kind of producer/consumer to fill the cache in the background. The nice thing is that the user can look at the current 1000 rows while the next 1000 are being loaded.
Batching could be a quick (temporary) solution while a buffered table is being built (or while you extend an existing table). I haven't looked for one in a long time, but I would be surprised if there were no implementation out there; Java is pretty popular and I know it is used to display huge datasets.
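For the producer/consumer part, this is roughly what I have in mind, sketched with a bounded queue from java.util.concurrent. The class and parameter names are hypothetical, it only covers scrolling forward, and it assumes a single consumer (the table).

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.IntFunction;

// Hypothetical producer: a background thread loads pages ahead of the reader.
class PageProducer<T> {
    private final BlockingQueue<List<T>> ready;
    private final Thread worker;
    private int remaining; // pages not yet handed to the consumer

    PageProducer(IntFunction<List<T>> loader, int firstPage, int pageCount, int pagesAhead) {
        this.ready = new ArrayBlockingQueue<>(pagesAhead);
        this.remaining = Math.max(0, pageCount - firstPage);
        this.worker = new Thread(() -> {
            try {
                for (int p = firstPage; p < pageCount; p++) {
                    ready.put(loader.apply(p)); // blocks while the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop quietly when cancelled
            }
        }, "page-producer");
        worker.setDaemon(true);
        worker.start();
    }

    // Returns the next page, waiting for the producer if needed; empty once all pages were read.
    List<T> nextPage() {
        if (remaining == 0) {
            return List.of();
        }
        try {
            List<T> page = ready.take();
            remaining--;
            return page;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a page", e);
        }
    }

    void stop() {
        worker.interrupt();
    }
}
```

The queue capacity (`pagesAhead`) is the knob to test: it decides how many pages are loaded before the user asks for them, and so how much memory the cache uses.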
|
 |
|
|
I did implement two in the past for a huge chemical db. I might need to re-implement one in a few months for another field.
There may already be some free implementations on the web for the technology you are using. Search for buffered table, table with cache, or table with huge dataset to display, things along those lines, and maybe add the keywords of the technology you are using: for .NET, for instance, "ListView table with cache buffer"; for Java....
I attach a screenshot to explain the concept.
Cheers,
Wil
|
 |
|
|
|
If the db is already there then we do not need the batching. To handle 100,000 songs we just need to make sure they are not all loaded in memory and use the window cache mechanism for the table display; this will solve all the issues. Batching is just a shortcut (a quick win) to get the functionality in case the db (and the cached table display) wasn't there. But ultimately the db plus the cached table can replace the batching.
|
 |
|
|
1 app for tagging is nice. If you start to divide it, it might become confusing. For example, I can experiment with the filter, and once I have something I like I add it to the autocorrector. This will make even more sense once you have the add-ins: we could experiment and then batch if we like the experiment.
One autocorrector instruction could be "load on table", though this may or may not be useful, it depends. I can see batch and table being used together for the failed files: I can really see myself asking to load 100,000 files by batches of 1000 and then having all the failed ones either kept loaded or loaded from a failed folder at the end ("load failed from c:\My Music Failed"), so basically I can go to the table, correct them manually and submit them again.
Ultimately people want to tag their library and update it once in a while (batch or no batch). All this discussion is mostly about getting around the memory limit. I do not feel a second app is necessary to solve this issue; batching or having a db could solve it, and I feel batching could solve it quickly until a db arrives.
For the table and the db, the best approach (though it takes a while to implement) is a table with a buffer up and a buffer down that point to the db. You display only a window of, let's say, 1000 records and prefetch 2000 records: the 1000 after and the 1000 before the ones currently displayed. If we move the slider up or down they are already cached (a cache window of 1000 up and 1000 down). When we arrive at the end of the display (the current window), you display the down cache window, prefetch the next 1000, get rid of the 1000 in the up cache window and replace them with the previously displayed rows, etc. Sorry if I didn't explain it well, but this works really well. It is basically a window view of the data, with a cache window up (for previous data) and a cache window down (for next data), and you "slide" those 3 windows up and down following what the user wants to see.
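Since the words are hard to follow, here is the 3-window rotation as a small sketch. It is deliberately synchronous to keep it short (a real table would refill the new cache window in a background thread), and the names and the loader function are assumptions for illustration.

```java
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical 3-window view: cache up, current view, cache down.
class SlidingWindow<T> {
    private final IntFunction<List<T>> loader; // e.g. "fetch page N of 1000 rows from the db"
    private final int pageCount;
    private int index; // page currently displayed
    private List<T> up;
    private List<T> current;
    private List<T> down;

    SlidingWindow(IntFunction<List<T>> loader, int pageCount) {
        if (pageCount <= 0) {
            throw new IllegalArgumentException("pageCount must be positive");
        }
        this.loader = loader;
        this.pageCount = pageCount;
        this.up = List.of();
        this.current = loader.apply(0);
        this.down = load(1);
    }

    // Pages outside the data set are represented by an empty window.
    private List<T> load(int page) {
        return page >= 0 && page < pageCount ? loader.apply(page) : List.of();
    }

    boolean slideDown() {
        if (index + 1 >= pageCount) {
            return false;
        }
        up = current;           // the old view becomes the cache up
        current = down;         // the cache down becomes the view
        index++;
        down = load(index + 1); // refill the new cache down
        return true;
    }

    boolean slideUp() {
        if (index == 0) {
            return false;
        }
        down = current;         // the old view becomes the cache down
        current = up;           // the cache up becomes the view
        index--;
        up = load(index - 1);   // refill the new cache up
        return true;
    }

    List<T> visibleRows() {
        return current;
    }

    int pageIndex() {
        return index;
    }
}
```

At any time only 3 pages are in memory, whatever the size of the library.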
|
 |
|
|
When you save, it just removes them, as it does today. If you do not save, then the files are not removed (they are appended) and Jaikoz keeps them in memory until either it finishes processing all the files or the memory blows up. This is how it works today if you do not save: you can try to load 100,000 songs and the memory may blow up at some point, but if you batch by 100 and remember to save in your autocorrector then you are OK. If you think about it, this is what we are doing by hand today: we cut our big library into small pieces and process them in more manageable chunks so the memory does not blow up. Automating those steps would be a great help and a huge time saver.
Later on you can indeed add a db and push the data into it, but I think batching could be a quick solution until the db is there. It also lets you keep memory use low (if you do not forget to put the save in your autocorrector script).
|
 |
|
|
Maybe something like this in the Autocorrector:
- Load next 100 files from folder "c:\My Music To Tag"
- ....
- .... (current autocorrector calls)
- ....
- Move & Save in folder "c:\My Music", move the failed ones to "c:\My Music Failed"
Would this be possible with Jaikoz?
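To show what I mean by the steps above, here is a rough sketch of the loop in plain Java. This is not Jaikoz code: the class name is made up, and `autoCorrect` is a placeholder for the existing autocorrector calls that returns true when a file was tagged successfully.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical batch runner: never holds more than one batch of files at a time.
final class BatchTagger {
    private BatchTagger() {
    }

    /** Processes the inbox batch by batch and returns how many files succeeded. */
    static int run(Path inbox, Path done, Path failed, int batchSize, Predicate<Path> autoCorrect)
            throws IOException {
        Files.createDirectories(done);
        Files.createDirectories(failed);
        int succeeded = 0;
        while (true) {
            List<Path> batch;
            // "Load next N files from folder": only the first N remaining files are listed.
            try (Stream<Path> files = Files.list(inbox)) {
                batch = files.filter(Files::isRegularFile)
                        .limit(batchSize)
                        .collect(Collectors.toList());
            }
            if (batch.isEmpty()) {
                return succeeded;
            }
            for (Path file : batch) {
                boolean ok = autoCorrect.test(file); // placeholder for the autocorrector calls
                // "Move & Save": every file leaves the inbox, so the loop always ends.
                Path target = (ok ? done : failed).resolve(file.getFileName());
                Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
                if (ok) {
                    succeeded++;
                }
            }
        }
    }
}
```

The failed folder can later be given back as the inbox to reprocess those files, automatically or by hand.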
thanks
Wil
|
 |
|
|
Something that would be nice: I give Jaikoz a folder (let's say 30,000 mp3s) and it automatically batches them by 300 or 5000 or whatever number is fastest. Once, let's say, 5000 mp3s are processed and moved to a different directory, it takes the next 5000 and continues.
I guess the only issue would be if you want to check by eye before saving, but otherwise this can be a huge time saver. The mp3s that failed autocorrect could be moved to a different directory to be reprocessed later, automatically or manually.
|
 |
|
|