Hadoop and Big Data Security

Anthem Hadoop Access Control
​
Overview:
​
As big data continues to grow, security concerns such as fine-grained access control have become increasingly important. The purpose of this project is to investigate a fine-grained solution for protecting Protected Health Information (PHI) and Personally Identifiable Information (PII). Currently, data analysts have to create two files, depending on the security level of the user viewing the data.
​
Our team is tasked with finding a data-level way to mask information and to implement access controls based on who is viewing it.
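The idea of a single data set whose view depends on the viewer can be sketched roughly as follows. This is only an illustration of data-level masking, not the solution the project will select: the field names, the role names, and the sample record are all hypothetical, and a real Hadoop deployment would enforce this in the platform's authorization layer rather than in application code.

```python
# Hypothetical sensitive fields and privileged roles, for illustration only.
SENSITIVE_FIELDS = {"name", "ssn"}
PRIVILEGED_ROLES = {"clinical_analyst"}


def mask_value(value):
    """Replace every character with '*', preserving the original length."""
    return "*" * len(str(value))


def view_record(record, role):
    """Return a copy of the record as the given role is allowed to see it."""
    if role in PRIVILEGED_ROLES:
        return dict(record)
    return {
        field: mask_value(value) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Made-up sample record; one stored copy serves both kinds of viewer.
record = {"member_id": 1001, "name": "Jane Doe", "ssn": "123-45-6789", "state": "GA"}

print(view_record(record, "clinical_analyst"))  # sees every field unmasked
print(view_record(record, "general_analyst"))   # sees name and ssn masked
```

With this approach only one copy of the data is kept, and the masking decision is made at read time from the viewer's role.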
​
Project Team:
​
Elena Edelstein
Main technical POC; will guide the initial implementation; provide VM requirements; provide software requirements; closely monitor the team's progress; help with issues that may arise in the virtual environment
Venkat Kambalapally
Local technical support; will be available to possibly come to the KSU campus to test our virtual machine and track progress
Venkat.Kambalapally@anthem.com
​
Justin McTiernan - Team Leader
Coordinate team meetings; Track progress of the project; Keep project schedule
Compile and turn in necessary documents; Main contact between team and client
Jmctiern@students.kennesaw.edu
404-625-2028
​
Josh Scott
Contribute to Project deliverables; Investigation; Documentation; Setup; Testing; Final Report and Presentation
Jscot164@students.kennesaw.edu
918-527-8527
​
Prakash Rai
Working together as a team; Research; Review; Identifying issues; Testing and development.
404-457-4257
​
Logan Smith
Implementation of test environment; Research and testing of potential solutions
lsmit395@students.kennesaw.edu
678-294-6283
​
Jared O’Brien
Technical writing; Quality testing; Research and implementation; Documentation
jobrie23@students.kennesaw.edu
678-925-6450
​
Jack Zheng
Facilitate project progress; advise on project management and technical solutions.
​
Monica O’Neal
Will monitor the team's progress throughout the semester; aid in coordination between the team and Anthem
​
Project website:
​
http://jmctiern.wixsite.com/anthemsec
​
Final Deliverables:
​
- Research report on access control (authorization) on Hadoop
- Solution implementation/demonstration on Linux
- Technical documentation of system implementation including a list of requirements (use cases) for the implementation
Milestones:
​
#2 - By 2/17/2017
- Testing site set up: Linux (RedHat, CentOS), Hadoop
- All sample data prepared and loaded to the system
- Requirements/expectations detailed from the client
#3 - By 3/17/2017
- Investigate all issues at a preliminary level and present results
- Evaluate and select focus areas for the next phase
#4 - By 4/14/2017
- Focus on the selected issues and present more solutions
Future milestone meetings date/time:
​
2/17, 3/17, and 4/14 (all Fridays) at 1 PM, at Anthem.
​
Communication and Meeting Planning:
​
Internal discussion via the Slack app at anthemsec.slack.com, plus Google Drive
​
Project Schedule and Tasks Planning:
**We will coordinate weekly or bi-weekly meetings with Elena as the project progresses**
​
2/3: Meeting to get VMs set up
- 4 VMs
- 2-4 cores
- 16 GB RAM
- 100-512 GB storage
2/3-2/6: Begin installing the software (provided by Elena), which is then transferred to a network drive
- Downloading will begin either on team members' machines or on the virtual machine itself
- All software will be placed in the virtual environment but will be implemented with Elena's guidance
2/5: Project Plan submission
2/6-2/13: Load sample data onto virtual machines
- Hadoopilluminated.com has big data sample sets available
2/6-2/13: Begin setting up virtual environment with Elena
- Elena will remotely help in the initial setup of the environment
2/13-2/15: Testing of virtual environment
- Running the big data sets through the machines
- Final checks for organization of data and data distribution
2/17: Milestone #2 meeting
2/18-2/28: Investigation of preliminary issues
- Data availability and access controls based on different views
3/1-3/17: Select focus areas for further investigation and identify possible solutions
3/17: Milestone #3 meeting
3/18-4/14: Focus on selected issues and investigate possible solutions
3/18-4/14: Implement and test possible solutions
3/18-4/14: Weigh pros and cons of each solution (security, implementation, effectiveness, etc.)
4/14: Milestone #4 meeting; present possible solution