Unistore: Project Updates

Presenter: Wei Xie
Project Members: Jiang Zhou, Mark Reyes, Jason Noble, David Cauthron and Yong Chen
Data-Intensive Scalable Computing Laboratory (DISCL)
Computer Science Department, Texas Tech University

We are grateful to Nimboxx and the Cloud and Autonomic Computing site at Texas Tech University for their valuable support of this project.

Unistore Overview

  • Goal: build a unified storage architecture (Unistore) for Cloud storage systems in which heterogeneous HDDs and SCM (Storage Class Memory) devices co-exist and are efficiently integrated
  • Prototype development based on Sheepdog and/or Ceph

[Architecture diagram; label: guide]

  • Characterization Component
      • Workloads: access patterns
      • Devices: bandwidth, throughput, block erasure, concurrency, wear-leveling
  • Data Placement Component
      • I/O Pattern: random/sequential, read/write, hot/cold
      • I/O Functions: Write_to_SSD, Read_from_SSD, Write_to_HDD
      • Placement Algorithm: modified consistent hash
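To illustrate what the Characterization Component might compute from the I/O patterns listed above, here is a minimal Python sketch that measures how sequential a block-level trace is and flags hot blocks. The trace format `(offset, length, op)`, the function names, and the thresholds are assumptions made for illustration; they are not taken from Unistore or Sheepdog.

```python
from collections import Counter

def sequential_ratio(trace):
    """Fraction of requests that start exactly where the previous one ended."""
    if len(trace) < 2:
        return 0.0
    sequential = sum(
        1
        for (prev_off, prev_len, _), (off, _, _) in zip(trace, trace[1:])
        if off == prev_off + prev_len
    )
    return sequential / (len(trace) - 1)

def hot_blocks(trace, block_size=4096, min_accesses=3):
    """Blocks accessed at least min_accesses times (hypothetical hotness rule)."""
    counts = Counter(off // block_size for off, _, _ in trace)
    return {block for block, n in counts.items() if n >= min_accesses}

# Hypothetical trace: (byte offset, length in bytes, "R" or "W")
trace = [
    (0, 4096, "W"), (4096, 4096, "W"), (8192, 4096, "W"),
    (0, 4096, "R"), (40960, 4096, "R"), (0, 4096, "R"),
]

print(sequential_ratio(trace))  # share of back-to-back requests
print(hot_blocks(trace))        # candidate blocks for SSD placement
```

A placement component could use such hints to decide between functions like Write_to_SSD and Write_to_HDD, although the actual decision rule is not described in the slides.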

Team and Leverage

  • Faculty: Yong Chen
  • Post-doc researcher: Jiang Zhou
  • Ph.D. student: Wei Xie
  • Undergraduate student: Mark Reyes
  • Nimboxx: Jason Noble and David Cauthron
  • Experimental platform: two nodes on the DISCL cluster in [email protected], used as Sheepdog storage nodes
      • CPU: 2 x 8-core E5-2650 v2, 2.60 GHz
      • Memory: 128 GB
      • Storage: 3 x 500 GB SAS HDD and 2 x 200 GB SSD
      • Phi 5110P coprocessors

Background: Challenges in Data Distribution

Requirements of data distribution:

  • Scalability
  • Load balance (based on capacity): data need to be distributed randomly and in statistical proportion to each node's capacity
  • Handling of node addition/removal
  • Data replication for fault tolerance
  • High performance: the throughput of storage nodes needs to be fully exploited

Consistent hashing and CRUSH handle the first four requirements fairly well. CRUSH is more flexible, as it can distribute data based on the physical organization of nodes for better fault tolerance.

Background: Challenges in Data Distribution

Heterogeneous storage environment:

  • Distinct throughput
      • NVMe SSD: 2000 MB/s or more
      • SATA SSD: ~500 MB/s
      • Enterprise HDD: ~150 MB/s
  • Large SSDs are becoming available, but are still expensive
      • 1.2TB NVMe Intel 750 costs $1000
      • 1TB SATA Samsung 640 EVO costs $500
      • 10x or more costly than HDDs
  • SSDs still co-exist with HDDs as accelerators instead of replacing them

Background: How to Use SSDs in Cloud-scale Storage

Traditional way of using SCMs (i.e. SSDs) in cloud-scale distributed storage: as a cache layer

  • Caching/buffering generates extensive writes to the SSD, which wears out the device
  • Needs a fine-tuned caching/buffering scheme
  • Does not fully utilize the capacity of SSDs
  • The capacity of SSDs is growing fast

Alternative: treat SSD-equipped nodes at the same level as HDD-equipped nodes

  • No need for cache replacement or buffer flushing
  • The user sees a storage system with combined capacity and maximized performance
  • Fewer writes to SSDs

  • Load-balance-aware distribution and performance-aware distribution naturally conflict
  • SSDs are usually smaller but faster, while HDDs are larger but slower
  • Existing data distribution algorithms do not consider this problem
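The conflict between capacity-aware and performance-aware distribution can be seen in a small experiment. The sketch below is a generic capacity-weighted consistent hash ring in Python (virtual nodes proportional to capacity). It is not Sheepdog's or Ceph's implementation, and the node names, capacities, and the `vnodes_per_100gb` parameter are hypothetical.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class WeightedRing:
    """Consistent hash ring with virtual nodes proportional to capacity."""

    def __init__(self, capacities_gb, vnodes_per_100gb=100):
        self._points = []
        for node, capacity in capacities_gb.items():
            vnodes = max(1, capacity * vnodes_per_100gb // 100)
            for i in range(vnodes):
                self._points.append((_hash(f"{node}#{i}"), node))
        self._points.sort()
        self._keys = [h for h, _ in self._points]

    def locate(self, obj_id: str) -> str:
        """Return the node owning the first ring point after the object's hash."""
        idx = bisect.bisect_right(self._keys, _hash(obj_id)) % len(self._keys)
        return self._points[idx][1]

# Hypothetical cluster: two large HDD nodes and one small SSD node.
capacities = {"hdd-1": 500, "hdd-2": 500, "ssd-1": 200}
ring = WeightedRing(capacities)

counts = {node: 0 for node in capacities}
for i in range(20000):
    counts[ring.locate(f"obj-{i}")] += 1
print(counts)
```

Under these assumptions the SSD node receives the smallest share of objects because it has the smallest capacity, even though it is the fastest device. Capacity-proportional placement alone therefore leaves the SSD's throughput under-used.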

Project Tasks: Overview

Data distribution management of Unistore

  • Modify the data distribution algorithm (consistent hashing) in Sheepdog or the CRUSH algorithm in Ceph
  • Achieve load balance, reliability and performance at the same time for heterogeneous storage
  • Different storage devices are unified and fully utilized
  • Two-mode distribution: BigData15 short paper
  • SUORA algorithm

Tracing IO operations and workload characterization

  • Instrument Sheepdog for IO tracing capability
  • Integrate the IO workload characterization component to serve as a hint for data distribution
  • Tracing component developed by Mark Reyes

Activities

  • Bi-weekly meeting for the team members to report progress and discuss problems
  • Each student member reports recent research and development progress
  • Bring up new ideas or discuss current ideas
  • Presentation slides and meeting minutes are maintained

Deliverables

  • Two-mode paper accepted by the BigData15 conference
  • SUORA paper completed and being prepared for submission
  • A new paper called Tier-CRUSH is in preparation
  • IO tracing and workload characterization component is being developed
  • Try patent filing

Two-Mode Data Distribution

[Diagram components: Data Objects, Data Distributor, Distributor Selector, Capacity Monitor, IO Monitor, Performance-based Distributor, Capacity-based Distributor, Storage Nodes]

  • Traditional data distribution only cares about load balance, i.e. it uses a capacity-based distributor
  • We propose to use a performance-based and a capacity-based distributor at the same time
  • Switching between the two modes is based on capacity usage and IO workload
  • Read and write policy to handle the two modes
  • Mode transition strategy to reduce data migration overhead

Throughput Improvement

[Figure annotations: 1.8x performance gain; migration overhead ignored]

  • Significant system throughput improvement over a wide range of user input
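The distributor-selector idea from the Two-Mode Data Distribution slide can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the scheme from the BigData15 paper: the `Node` fields, the thresholds `usage_limit` and `load_threshold`, and the switching rule are hypothetical, and a seeded weighted choice stands in for the modified consistent hash.

```python
import hashlib
import random
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    capacity_gb: float
    throughput_mbps: float
    used_gb: float = 0.0

def _pick(nodes, weights, obj_id):
    """Deterministic weighted choice, seeded by the object id."""
    seed = int.from_bytes(hashlib.md5(obj_id.encode()).digest()[:8], "big")
    return random.Random(seed).choices(nodes, weights=weights, k=1)[0]

def capacity_based(nodes, obj_id):
    """Place objects in proportion to node capacity (load balance)."""
    return _pick(nodes, [n.capacity_gb for n in nodes], obj_id)

def performance_based(nodes, obj_id):
    """Place objects in proportion to node throughput (performance)."""
    return _pick(nodes, [n.throughput_mbps for n in nodes], obj_id)

def select_mode(nodes, io_load, usage_limit=0.8, load_threshold=0.5):
    """Hypothetical selector: favor performance under heavy IO load,
    unless the fastest node is close to full."""
    fastest = max(nodes, key=lambda n: n.throughput_mbps)
    fast_nearly_full = fastest.used_gb / fastest.capacity_gb >= usage_limit
    if io_load >= load_threshold and not fast_nearly_full:
        return performance_based
    return capacity_based

# Hypothetical cluster, with throughputs in the ranges quoted in the background slides.
nodes = [
    Node("hdd-1", 500, 150),
    Node("hdd-2", 500, 150),
    Node("ssd-1", 200, 500),
]
distributor = select_mode(nodes, io_load=0.9)
target = distributor(nodes, "obj-42")
print(distributor.__name__, target.name)
```

The capacity monitor and IO monitor in the diagram would supply `used_gb` and `io_load` here. A real mode transition would also need the read/write policy and migration strategy mentioned on the slide, which this sketch omits.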

SUORA Algorithm

  • Multiple tiers; each tier represents a type of storage device with similar characteristics (performance, capacity)
  • Data are placed across different tiers based on hotness
  • Within each tier, data are distributed across nodes randomly, uniformly and in proportion to capacity

Conclusions

  • Reconsider data distribution for heterogeneous storage devices with distinct performance metrics
  • The two-mode scheme aims to provide maximized performance while still maintaining load balance, without drastic changes to existing data distribution algorithms
  • Analysis shows the potential of the two-mode scheme
  • More trace-based or real-world evaluation of the scheme is still needed
  • The proposed algorithms received positive feedback from the BigData conference

On-going/Future Work

  • Starting to implement the proposed algorithms in Sheepdog or Ceph
  • Continue the development of the IO tracing and characterization component
  • Writing a new paper named Tiered-CRUSH that extends the CRUSH algorithm to support heterogeneous storage
  • Integrate the workload characterization component and the data distribution component
  • Test on the experimental platform

Thank You

Please visit: http://cac.ttu.edu/, http://discl.cs.ttu.edu/

Acknowledgement: The [email protected] is funded by the National Science Foundation under grants IIP-1362134 and IIP-1238338.

Please take a moment to fill out your L.I.F.E. forms.
http://www.iucrc.com
Select "Cloud and Autonomic Computing Center", then select the "IAB" role.
What do you like about this project? What would you change? (Please include all relevant feedback.)
