SharkDB: an in-memory column-oriented storage for trajectory analysis

dc.contributor.author: Zheng, Bolong
dc.contributor.author: Wang, Haozhou
dc.contributor.author: Zheng, Kai
dc.contributor.author: Su, Han
dc.contributor.author: Liu, Kuien
dc.contributor.author: Shang, Shuo
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.institution: The University of Queensland, Brisbane, Australia
dc.contributor.institution: Pivotal Incorporated, San Francisco, USA
dc.contributor.institution: School of Computer Science and Technology, Soochow University, Suzhou, China
dc.contributor.institution: Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, China
dc.date.accessioned: 2017-05-21T05:30:10Z
dc.date.available: 2017-05-21T05:30:10Z
dc.date.issued: 2017-05-05
dc.date.published-online: 2017-05-05
dc.date.published-print: 2018-03
dc.description.abstract: The last decade has witnessed the prevalence of sensor and GPS technologies that produce a high volume of trajectory data representing the motion history of moving objects. However, some characteristics of trajectories, such as variable lengths and asynchronous sampling rates, make them difficult to fit into traditional database systems that are disk-based and tuple-oriented. Motivated by the success of column stores and the recent development of in-memory databases, we explore opportunities for boosting the performance of trajectory data processing by designing a novel trajectory storage within main memory. In contrast to most existing trajectory indexing methods, which keep consecutive samples of the same trajectory on the same disk page, we partition the database into frames in which the positions of all moving objects at the same time instant are stored together and aligned in main memory. We found this column-wise storage to be surprisingly well suited for in-memory computing, since most frames can be stored in highly compressed form, which is pivotal for increasing memory throughput and reducing CPU cache misses. The independence between frames also makes them natural working units when parallelizing data processing in a multi-core environment. Lastly, we run a variety of common trajectory queries on both real and synthetic datasets in order to demonstrate the advantages and study the limitations of our proposed storage.
dc.description.sponsorship: This work is partially supported by the Natural Science Foundation of China (No. 61502324 and No. 61532018).
dc.eprint.version: Post-print
dc.identifier.citation: Zheng B, Wang H, Zheng K, Su H, Liu K, et al. (2017) SharkDB: an in-memory column-oriented storage for trajectory analysis. World Wide Web. Available: http://dx.doi.org/10.1007/s11280-017-0466-9.
dc.identifier.doi: 10.1007/s11280-017-0466-9
dc.identifier.issn: 1386-145X
dc.identifier.issn: 1573-1413
dc.identifier.journal: World Wide Web
dc.identifier.uri: http://hdl.handle.net/10754/623667
dc.internal.reviewer-note: Embargo until (dd/mm/yyyy): 05/05/2018
dc.publisher: Springer Nature
dc.relation.url: http://link.springer.com/article/10.1007/s11280-017-0466-9
dc.rights: The final publication is available at Springer via http://dx.doi.org/10.1007/s11280-017-0466-9
dc.subject: Spatial database
dc.subject: Trajectory
dc.subject: In-memory
dc.subject: Storage
dc.title: SharkDB: an in-memory column-oriented storage for trajectory analysis
dc.type: Article
kaust.person: Shang, Shuo
orcid.author: Zheng, Bolong
orcid.author: Wang, Haozhou
orcid.author: Zheng, Kai
orcid.author: Su, Han
orcid.author: Liu, Kuien
orcid.author: Shang, Shuo
refterms.dateFOA: 2018-05-05T00:00:00Z
Files
Original bundle
Name: sharkdb.pdf
Size: 676.35 KB
Format: Adobe Portable Document Format
Description: Accepted Manuscript
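
Illustrative sketch of the frame idea in the abstract: the abstract describes partitioning the database into frames that hold the positions of all moving objects at the same time instant. The Python below is only a minimal sketch of that layout under stated assumptions, not SharkDB's implementation. The function names `build_frames` and `range_query`, the `interval` parameter, the dictionary-based frame representation, and the sample data are all hypothetical; compression and parallel processing, which the abstract also mentions, are not shown.

```python
from collections import defaultdict


def build_frames(trajectories, interval):
    """Group trajectory samples into frames keyed by time-slot index.

    trajectories: dict mapping object id -> list of (t, x, y) samples.
    interval: frame width, in the same unit as t (an assumed parameter).
    Returns a dict: frame index -> {object id: (x, y)}.
    """
    frames = defaultdict(dict)
    for obj_id, samples in trajectories.items():
        for t, x, y in samples:
            frame_idx = int(t // interval)
            # If an object has several samples in one slot, the one that
            # appears later in its sample list overwrites the earlier one.
            frames[frame_idx][obj_id] = (x, y)
    return dict(frames)


def range_query(frames, frame_idx, xmin, ymin, xmax, ymax):
    """Return the sorted ids of objects inside a rectangle at one frame."""
    frame = frames.get(frame_idx, {})
    return sorted(
        obj_id
        for obj_id, (x, y) in frame.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    )


# Hypothetical sample data: two objects sampled at different instants.
trajectories = {
    "a": [(0, 1.0, 1.0), (10, 2.0, 2.0), (20, 3.0, 3.0)],
    "b": [(3, 5.0, 5.0), (12, 2.5, 1.5)],
}
frames = build_frames(trajectories, interval=10)
hits = range_query(frames, 1, 0.0, 0.0, 3.0, 3.0)
```

Because each frame is self-contained in this sketch, a query over one time slot touches a single frame, and separate frames could be handed to separate workers, which mirrors the independence between frames noted in the abstract.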