| Unstructured Data |
“If we don’t carefully name and organize our file system data, the only guarantee is we will quickly lose track of it” - a major CG Animation Studio |
Definition: |
| Data stored on the file system, outside of any database, as an independent, standalone file. |
Background: |
| Research analysts estimate that 80-85% of all data is unstructured, and that 30-40% of that data is no longer of any use to the organization that generated or stored it. Nevertheless, corporations continue to buy nearly 50% more storage every year to hold their digital assets. Since hardware acquisition accounts for only roughly 10% of the storage management cost, corporations have a material, growing problem to deal with related to: |
|
Why does this happen? |
| Organizations have hundreds or thousands of users creating, extending, copying, or distributing data every day. Because users are typically measured on their impact on the core business, not on their ability to manage data, the creators and owners of the data spend little time or effort managing its growth. Meanwhile, the IT groups with the mandate to manage the corporate data assets lack the information about the company’s business objectives and schedules, and the substantive knowledge, needed to bring precision to the process. Thus, each small group of users organizes its data on the company’s shared drives with varying degrees of organization, consistency, and discipline. With no database to force structure, little knowledge within the internal IT group to determine a better structure, and no obvious technology suite to support a more “standard” structure across the organization, the data simply collects across a host of file servers according to very localized schemas and temporary project definitions. Not only does this make the data hard to find later; users also compound the problem by copying or emailing their version of file X all around the network as the data migrates between groups or across boundaries of responsibility. Poor data management practices are a root cause of storage growth in the catch-all “unstructured” category. |
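To make the copying problem concrete, here is a minimal Python sketch of one way to spot identical copies of a file scattered across a shared drive. It is an illustration only, not a method described by the source: the function names `file_digest` and `find_duplicates` are hypothetical, and it assumes that two files count as duplicates when their contents are byte-for-byte identical, as judged by a SHA-256 digest.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Group files under `root` whose contents are identical.

    Hypothetical helper: files are first bucketed by size (cheap),
    and only same-size files are hashed (expensive).
    """
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    by_hash: Dict[str, List[Path]] = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for p in paths:
            by_hash[file_digest(p)].append(p)

    # Keep only groups that contain more than one file.
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates(Path("/some/shared/drive"))` (the path is a placeholder) would return one list per set of identical files. A scan like this only finds exact copies; versions of "file X" that were edited after being copied would need a different comparison.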

