Mage Data

Tag: Sensitive Data Discovery

  • Are There Good Open Source Tools for Sensitive Data Discovery?

    Open-source tools have come into their own in the past decade, including tools for sensitive data discovery. What used to be the domain of large corporations has been democratized, and teams of passionate people can (and do) develop amazing tools. However, with the ever-growing number of data privacy and security laws, the stakes around data classification have never been higher. Getting sensitive data discovery right has significant consequences, so it’s critical that you understand what you’re getting with these tools and how you can use them in ways that will keep you (and your customer and employee data) safe.

    What Makes Data Discovery Tools Open-Source?

    We’ve already covered what makes software open source in depth in a separate article, but here is a quick recap of what we’ll be discussing. Unlike closed-source tools, open-source sensitive data discovery tools are released under a license that allows others to use and alter the software freely for their own purposes. Generally, instead of being created and owned by a corporation, open-source software is developed by a passionate community whose members collaborate to create new features and often determine future direction democratically.

    Many talented people are working on great open-source sensitive data discovery tools like OpenDataDiscovery, ReDiscovery, DataDefender, and more. Consequently, to answer the question in the title of the article, there are good open-source tools for sensitive data discovery. However, that’s not necessarily the question you should be asking—instead, you should be trying to determine if they’ll be right for your company. And one of the best ways to make that determination is through a SWOT Analysis, taking a detailed look at the Strengths, Weaknesses, Opportunities, and Threats that come from using open-source tools for data discovery.

    Data Discovery Tools: Strengths

    First up are the strengths—the things that open-source data discovery tools do well.

    Interoperability and Flexibility

    Because there are generally a variety of perspectives involved in open-source tools, there’s often little incentive to hide features and programs behind walled gardens. In this case, that often translates into tools with a wide range of integrations and connections for data. And even when a certain database type isn’t supported, these tools often provide a way for you to build the integration yourself, ensuring that getting data is rarely a roadblock.

    Price

    And, of course, the best price you can get for anything is free. That could mean you save a bit of money or free up resources to invest in areas that need it more. Whatever the case, it will be hard to get a better deal than what you get with open-source tools.

    Data Discovery Tools: Weaknesses

    Of course, no software is perfect. Here are some things open-source data discovery tools don’t always do well.

    Unknown Development Cycles

    Many B2B tools feature a regular and predictable development cycle. Some open-source projects are very organized, and others are less so. Regardless, there’s no guarantee that a feature or fix will come out on time—or even that there will be a roadmap to start with. The inherent unpredictability of the process can sometimes be frustrating.

    Enterprise Readiness

    As companies grow, their data environments become more complex at an exponential rate. Not all open-source data discovery tools can handle the complexity of a modern enterprise data environment. And of those that can, not all will be able to provide the detailed reporting and compliance options that companies need to meet their legal obligations.

    Data Discovery Tools: Opportunities

    With open-source tools, companies have some opportunities they wouldn’t necessarily have with paid tools.

    Opportunity to Influence Development

    As a user of an open-source tool, you’re part of the community developing it. While you still won’t have ultimate control over its development direction, you’ll likely have the ability to vote on next steps and generally have greater influence on the development process than you would over most paid tools. This can provide the opportunity to get the features you need faster than traditional development.

    Customization via Forking

    And if the community doesn’t prioritize your needs, you’re allowed to fork, or make a copy of, the underlying source code, allowing your company to continue development in the way it sees fit. That’s an option you will rarely, if ever, have with traditional software.

    Data Discovery Tools: Threats

    Of course, there are some downsides to open-source tools.

    Poor/Nonexistent Customer Support

    Because open-source tools are generally community-run projects where people work for free, customer support is not guaranteed. People, including other users, are often very helpful through online forums, but that often doesn’t rise to the same level of support you would get from just about any paid tool. And when you have a serious issue with your software, this problem can keep you from resolving it quickly. And as a reminder, 99 percent success in data discovery isn’t good enough, and could open you up to serious legal ramifications. If you’re having an issue with sensitive data discovery, failing to find a quick solution can be an expensive mistake.

    Rogue Developers

    While it’s unlikely that the developers of an open-source data discovery tool would insert malware or create serious security vulnerabilities, it’s not unheard of. But even if no one acts maliciously, there’s a real chance that the project will eventually be abandoned without warning. And abandoned software won’t receive security updates or new features and could leave you looking for a new solution once more.

    How Mage Data Helps with Sensitive Data Discovery

    If you’ve reached the end of the above SWOT analysis feeling that the strengths and opportunities far outweigh the weaknesses and risks, then there’s a good chance that there’s a great open-source sensitive data discovery tool out there for you. But that won’t be the case for all businesses. It doesn’t mean that the tools are bad, just that they are not a good fit for all business contexts.

    Remember that sensitive data discovery is the starting point of good data management. There are many more things that need to be done to keep data safe and companies compliant. Here at Mage, we’ve developed a world-class AI-powered sensitive data discovery tool that’s part of a larger suite of tools designed to manage data from discovery all the way to retirement. If that sounds more like what you need, sign up for a free consultation today to learn more about what Mage Data can do for you.

  • What to Look for in a Sensitive Data Discovery Tool

    Selecting the right sensitive data discovery tool for your organization can be challenging. Part of the difficulty lies in the fact that you will only get a feel for how effective your choice is after purchasing and implementing it. However, there are things you can do to help maximize your return on investment before you buy by focusing your attention on the right candidates. By selecting your finalists based on their ability to execute on the best practices for sensitive data discovery, you can significantly increase the odds that your final choice is a good fit for your needs.

    Best Practices for Sensitive Data Discovery

    Of course, you can’t effectively select for the best practices in sensitive data discovery without a deep understanding of what they are and how they impact your business. While any number of approaches could be considered “best practices,” here are four that we believe are the most impactful when implementing a new sensitive data discovery system.

    Maximize Automation

    While more automation is almost always good, when it comes to sensitive data discovery, there’s a big difference between increasing automation and maximizing automation. In an ideal world, your data team would configure the rules for detecting personally identifiable information once and then spend their time on higher-value activities like monitoring and reporting. But there’s more to automation than just data types. Is the reporting automated? Does the system work well with the system that handles “right to be forgotten” requests? Any human-driven process is likely to fail when scaled up to millions or billions of data points. Success in this area means finding a solution that maximizes automation and minimizes the burden on your team.
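
    To make the "configure the rules once" idea concrete, here is a minimal sketch in Python. It assumes detection rules can be expressed as regular expressions and that records arrive as dictionaries. The rule names, the patterns, and the functions scan_records and summarize are hypothetical illustrations, not part of any particular product, and the patterns are far too simple for production use.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A named detection rule (hypothetical structure for illustration)."""
    name: str
    pattern: re.Pattern


# Assumption: deliberately simplified patterns; real PII detection needs
# validation, context, and many more data types than these two.
RULES = [
    Rule("email", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    Rule("us_ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def scan_records(records, rules=RULES):
    """Return (record_index, field, rule_name) for every rule that matches."""
    findings = []
    for index, record in enumerate(records):
        for field, value in record.items():
            for rule in rules:
                if rule.pattern.search(str(value)):
                    findings.append((index, field, rule.name))
    return findings


def summarize(findings):
    """Count findings per rule so reporting needs no manual tallying."""
    counts = {}
    for _, _, rule_name in findings:
        counts[rule_name] = counts.get(rule_name, 0) + 1
    return counts


sample = [
    {"note": "Contact jane.doe@example.com", "id": "1001"},
    {"note": "SSN on file: 123-45-6789", "id": "1002"},
    {"note": "No personal data here", "id": "1003"},
]
findings = scan_records(sample)
print(summarize(findings))
```

    The rules are defined in one place and both the scan and the summary run without human input, which is the property the paragraph above asks you to look for in a tool.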

    Merge Classification and Discovery

    Data must be classified before its insights can be unlocked. Despite its similarities to data discovery, data classification is sometimes handled by a different department with different tools. A potential downside of that approach is that a key stakeholder gets a report from each department and asks why the numbers don’t match. As a result, your team is forced to spend time reconciling the different tools’ output—which is not a great place to expend resources. An easy way to fix this problem is to use a single tool to perform both processes. If that’s not a viable approach, ensuring the tools are integrated to produce the same results can be a great way to ensure that your company has a unified and consistent view of its data.
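
    If two separate tools must stay in place, the reconciliation step can at least be automated. The sketch below is one possible approach, assuming each tool can export a mapping from column name to a sensitivity label. The function reconcile, the column names, and the labels are all hypothetical.

```python
def reconcile(discovery, classification):
    """Return columns where the two tools disagree.

    Each argument maps a column name to the label a tool assigned.
    A value of None in the result means the tool did not report the column.
    """
    mismatches = {}
    for column in sorted(set(discovery) | set(classification)):
        found = discovery.get(column)
        labeled = classification.get(column)
        if found != labeled:
            mismatches[column] = (found, labeled)
    return mismatches


# Hypothetical exports from a discovery tool and a classification tool.
discovery_output = {
    "users.email": "email",
    "users.ssn": "ssn",
    "orders.note": "email",
}
classification_output = {
    "users.email": "email",
    "users.ssn": "national_id",
}

for column, (found, labeled) in reconcile(discovery_output, classification_output).items():
    print(f"{column}: discovery={found}, classification={labeled}")
```

    A report like this shows exactly where the numbers diverge, though a single tool that performs both processes avoids the problem entirely.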

    Develop a Multi-Channel Approach

    One trap that companies sometimes fall into is believing that the discovery process is over once data from outside the company is identified and appropriately secured on the company network. This approach neglects one of the biggest sources of risk when it comes to data: your employees. Are you monitoring employee endpoints such as laptops and desktops for personally identifiable information? If so, are you able to manually or automatically remedy the situation? You won’t always be able to stop employees from making risky moves with data. However, with a multi-channel approach to sensitive data discovery, you can monitor the situation and develop procedures to limit the damage.

    Create Regular Risk Assessments

    Identifying your sensitive data is only the first step in the process. To understand your company’s overall risk, you must deeply understand the relative risk that each piece of sensitive information holds. For example, data moved across borders holds significantly more risk than that in long-term cold storage. Databases that hold customer information inherently have more risk than those holding only corporate information. To meaningfully prioritize your efforts in securing data and optimizing your processes, you need regular risk assessments. At scale, this can be difficult to do on your own—so your sensitive data discovery software either needs to do it for you or have a robust integration with a program that can.
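
    As a rough illustration of relative risk, the sketch below ranks data stores using the factors mentioned above: customer data, cross-border movement, and cold storage. The DataStore fields, the weight values, and the functions risk_score and rank_by_risk are assumptions made for this example. Real weights would come from your own legal and security review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical description of one data store found during discovery."""
    name: str
    sensitive_records: int
    holds_customer_data: bool
    crosses_borders: bool
    cold_storage: bool


# Assumption: illustrative multipliers only, not recommended values.
CUSTOMER_WEIGHT = 2.0
CROSS_BORDER_WEIGHT = 3.0
COLD_STORAGE_WEIGHT = 0.5


def risk_score(store):
    """Scale the sensitive record count by each applicable risk factor."""
    score = float(store.sensitive_records)
    if store.holds_customer_data:
        score *= CUSTOMER_WEIGHT
    if store.crosses_borders:
        score *= CROSS_BORDER_WEIGHT
    if store.cold_storage:
        score *= COLD_STORAGE_WEIGHT
    return score


def rank_by_risk(stores):
    """Order stores from highest to lowest score to prioritize remediation."""
    return sorted(stores, key=risk_score, reverse=True)


stores = [
    DataStore("archive", 10_000, True, False, True),
    DataStore("crm_replica", 10_000, True, True, False),
    DataStore("hr_internal", 2_000, False, False, False),
]

for store in rank_by_risk(stores):
    print(store.name, risk_score(store))
```

    Running a ranking like this on a schedule turns a one-time inventory into the regular risk assessment described above.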

    Choosing the Right Sensitive Data Discovery Software

    While there are many possible ways to select a sensitive data discovery tool, the best practices we’ve covered offer a good starting place for most businesses. Remember that the features one software package has compared with another are not necessarily as important as how those features support your business objectives. Maximizing automation, merging discovery and classification, developing a multi-channel approach, and creating regular risk assessments all have relatively little to do with the actual mechanics of data discovery, but they can all make a huge difference when building a healthy, secure company. There are a lot of different sensitive data discovery solutions that can solve your immediate problem. However, they may not do it in a way that holistically improves your business.

    Another important point is that data discovery is the first step in the data lifecycle that runs all the way to retirement. You could use a different tool for each stage of the process, but the end result would be a system with multiple independent parts that may or may not work well together. Ideally, you would be able to handle data throughout the lifecycle in one application. That’s where Mage Data comes in.

    How Mage Data Helps with Sensitive Data Discovery

    Mage Data’s approach to data security begins with robust data discovery through its patented Mage Sensitive Data Discovery tool, which is powered by artificial intelligence and natural language processing. It can identify more than 80 data types right out of the box and can be configured for as many custom data types as you need.

    But that’s only the start of the process. Mage’s Data Masking solutions provide powerful static and dynamic masking options to keep data safe during the middle of its lifecycle, and its Data Minimization tool helps companies handle data access and erasure requests and create robust data retention rules. Having an integrated platform that handles all aspects of data security and privacy can save you money and be far simpler to operate than having different platforms for different operations. Whether you’re a small business or an enterprise, we believe your data solutions should just work. To learn more about how Mage Data can help you with sensitive data discovery, schedule a demo today.