Data Security

Data security is a major concern for all organizations. The Data Mine works hard to ensure all data is secure and protected. This page explains how to keep your data secure in The Data Mine.

Virtual Meetings

Make sure you are in a secure and private location when having virtual meetings with mentors. If you are in a public location, use headphones to keep the conversation private. Avoid holding mentor meetings in open spaces where others can hear you. Be mindful of your surroundings and think about who can see your computer screen. When working on your Data Mine project, do not leave your computer unattended in a public space.

Virtual Private Network (VPN)

"A VPN, which stands for virtual private network, establishes a digital connection between your computer and a remote server owned by a VPN provider, creating a point-to-point tunnel that encrypts your personal data, masks your IP address, and lets you sidestep website blocks and firewalls on the internet. This ensures that your online experiences are private, protected, and more secure." Microsoft

When working on Data Mine projects over public Wi-Fi, consider using a VPN. Purdue students, staff, and faculty have free access to Cisco AnyConnect. Refer to the instructions for installing Cisco AnyConnect.

Project Details

Do not share sensitive information about your project with anyone outside of your team. This includes other students, friends, family, employers, and anyone else who is not directly involved in your project.

During the job interview process, you may be asked about your project. Keep the details of your project high level and do not share any proprietary information about the company or data. You can inform the employer that you’ve signed an NDA and cannot share specific details about the project.

Access Control

Access control is the process of controlling who can access which data. Each person is granted a permission level, and each piece of data is visible only to people at or above the level it requires. Administrators typically manage the permission levels for different groups. Access control prevents people from seeing data they should not have access to.
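As a minimal sketch of the idea above, the Python example below compares a user's permission level against the level a dataset requires. The level names, user names, and dataset names are all hypothetical and are not taken from any Data Mine system; real access control is enforced by the platform that hosts the data, not by a script like this.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical permission levels, ordered from least to most access."""
    PUBLIC = 0
    TEAM = 1
    ADMIN = 2


# Hypothetical users and the permission level each one holds.
USER_LEVELS = {
    "visitor": Level.PUBLIC,
    "student_a": Level.TEAM,
    "ta_admin": Level.ADMIN,
}

# Hypothetical datasets and the minimum level required to view each one.
DATASET_LEVELS = {
    "course_syllabus": Level.PUBLIC,
    "partner_sales_data": Level.TEAM,
    "access_logs": Level.ADMIN,
}


def can_access(user: str, dataset: str) -> bool:
    """Return True only if the user's level meets the dataset's requirement.

    Unknown users and unknown datasets are denied by default.
    """
    user_level = USER_LEVELS.get(user)
    required_level = DATASET_LEVELS.get(dataset)
    if user_level is None or required_level is None:
        return False
    return user_level >= required_level


if __name__ == "__main__":
    for user in USER_LEVELS:
        for dataset in DATASET_LEVELS:
            print(f"{user} -> {dataset}: {can_access(user, dataset)}")
```

Denying unknown users and datasets by default is a deliberate choice in this sketch: a missing entry results in no access rather than accidental access.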

Characterization of Big Data

Volume: The quantity of data typically ranges from terabytes to zettabytes.

Variety: Data comes in many forms, such as images, sounds, videos, and other structured and unstructured data types, which makes it difficult to analyze.

Data sources: Data is usually streamed in from many different platforms and sources. The integration process can be complex, and multiple datasets are often required to make inferences, rather than a single standalone dataset.

Data Storage/Sharing

Corporate partner data should never be stored on your personal computer. If it has been stored there, please follow the steps below to delete it.

Corporate partner data should never be shared. If your team is using generative AI for your project, you should never input proprietary company data into the generative AI prompt.

Doing so is a breach of the NDA you signed earlier in the semester.

Delete Data from Windows

When data is accidentally downloaded onto a Windows machine, the commands below can be used to ensure that it is fully deleted.

This should be accompanied by an email to datamine@purdue.edu to document the deletion.

Windows sdelete

Microsoft provides the sdelete command-line utility, which can be used to securely delete data and ensure that it cannot be recovered. It is similar to the shred command on Unix-like systems.

The link to download the sdelete executable can be found here: learn.microsoft.com/en-us/sysinternals/downloads/sdelete

Once the files are downloaded, they will need to be extracted before use.

After they have been extracted, you can run the executable directly. The command below is an example in which sdelete was extracted into the Downloads folder and the file to delete is on the Desktop.

C:\Users\userName\Downloads\sdelete.exe C:\Users\userName\Desktop\bad_file.txt
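If you prefer to script the same step, the Python sketch below assembles the sdelete command shown above and runs it. The function names are hypothetical, the paths are the same placeholder paths used in the example above, and the only sdelete option used is -p, which sets the number of overwrite passes. Treat it as an illustration under those assumptions, not as an official Data Mine tool.

```python
import subprocess
from pathlib import PureWindowsPath


def build_sdelete_command(sdelete_exe: str, target: str, passes: int = 1) -> list[str]:
    """Build the argument list for running sdelete on one file.

    sdelete_exe: path to the extracted sdelete.exe (placeholder in this sketch).
    target: path of the file to securely delete.
    passes: number of overwrite passes, passed to sdelete's -p option.
    """
    if passes < 1:
        raise ValueError("passes must be at least 1")
    return [
        str(PureWindowsPath(sdelete_exe)),
        "-p",
        str(passes),
        str(PureWindowsPath(target)),
    ]


def secure_delete(sdelete_exe: str, target: str, passes: int = 1) -> None:
    """Run sdelete on the target; raises CalledProcessError if it fails."""
    subprocess.run(build_sdelete_command(sdelete_exe, target, passes), check=True)


if __name__ == "__main__":
    # Placeholder paths from the example above; replace userName and the
    # file name with your own before running on a Windows machine.
    command = build_sdelete_command(
        r"C:\Users\userName\Downloads\sdelete.exe",
        r"C:\Users\userName\Desktop\bad_file.txt",
    )
    print(" ".join(command))
    # Uncomment to actually delete the file (this cannot be undone):
    # secure_delete(command[0], command[-1])
```

Sysinternals tools usually ask you to accept a license agreement the first time they run, so you may need to run the executable once by hand before scripting it. Deletion with sdelete is permanent, so double-check the target path first.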

Delete Data from macOS

Citation

E. Bertino, "Data Security and Privacy: Concepts, Approaches, and Research Directions," 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), Atlanta, GA, USA, 2016, pp. 400-407, doi: 10.1109/COMPSAC.2016.89.

“What Is a VPN? Why Should I Use a VPN?: Microsoft Azure.” Why Should I Use a VPN? | Microsoft Azure, azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-vpn. Accessed 5 Feb. 2024.