Proactive Versus Reactive Data Management

The following outlines the basic technology approaches to managing file-based workflows that generate unstructured data. The approaches are not mutually exclusive, and companies often deploy a combination of them.
 
Reactive / After the Fact (Data Creation)   Proactive / At Time of Data Creation

 
Does not address the root cause of the problem   Targets the root cause of the problem
     
Attempts to make sense of data after the fact   Introduces organization at time of data creation
     
[Image: messy filing cabinet]   [Image: organized filing cabinet with folders labeled Sequence 45, Sequence 46, Sequence 47, and Shot 5]
  • Mess has already been created
  • Hence file server names like "landfill"
         
    Make Sense Of Existing Data Mess   Introduce Structure Via Silo Or Database

     
    • Search, Classification, Data Mining, & Text Analytics
    • Difficulty in determining why data is even being stored in the first place (what project, business objective, etc.)
     
    • Specialized catalog, purpose built or project oriented databases
    • Typically catalog data into a database
    • Catalog system may reference native file system
    • Application silo holds data hostage
    • Data outside of catalog is not managed
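
The catalog approach above can be sketched as follows. This is a minimal illustration, not any particular product's design: the table layout and the function names `build_catalog` and `unmanaged` are assumptions made for the example. It shows a catalog that references files by path, and why anything outside the catalog goes unmanaged.

```python
import sqlite3

def build_catalog(entries):
    """Create an in-memory catalog; each row points at a native file by path.

    `entries` is an iterable of (path, project, description) tuples.
    The schema is hypothetical and chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog (path TEXT PRIMARY KEY, project TEXT, description TEXT)"
    )
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)", entries)
    conn.commit()
    return conn

def unmanaged(conn, paths_on_disk):
    """Return paths that exist on the file system but have no catalog row."""
    cataloged = {row[0] for row in conn.execute("SELECT path FROM catalog")}
    return sorted(p for p in paths_on_disk if p not in cataloged)
```

Calling `unmanaged(conn, paths)` with a list of paths found on disk returns the ones the catalog knows nothing about, which is the gap the last bullet describes.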
         
    Store Data Mess More Efficiently   Introduce Structure Into Underlying File Systems

     
    • Data Deduplication
    • Stores multiple copies more efficiently, but makes no attempt to address the root cause of, or the problems associated with, multiple copies
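
To illustrate the deduplication point, here is a minimal sketch of file-level deduplication by content hash. The function name `dedup_store` and the in-memory dictionaries are assumptions for the example; real deduplication systems typically work on blocks or chunks rather than whole files. Note that identical content is stored once, yet every duplicate path still exists in the index.

```python
import hashlib

def dedup_store(files):
    """Store identical content once, keyed by its SHA-256 digest.

    `files` maps path -> bytes. Returns (blobs, index), where `blobs`
    maps digest -> content and `index` maps every original path to the
    digest of its content.
    """
    blobs = {}
    index = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        blobs.setdefault(digest, data)  # keep only the first copy of the bytes
        index[path] = digest            # but every path is still tracked
    return blobs, index
```

Storage shrinks because `blobs` holds each unique content once, but `index` is as cluttered as before: the copies and the confusion about which one is authoritative remain.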
     
    • Standardize file system structure
    • File system becomes index to data
    • Hierarchy provides a simple mechanism for relating data
    • Naming conventions improve the quality of metadata
    • Normalizes the organization of file systems for efficiency and assured quality
    • Detach data management practices from applications and infrastructure
    • The collection of underlying file systems then forms the largest "data silo", removing limitations on the file types and formats that can be managed
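
As a rough sketch of how a standardized structure lets the file system act as an index, the example below derives metadata directly from a path. The convention `<project>/seq_<NNN>/shot_<NNN>/<file>` and the function name `metadata_from_path` are hypothetical, chosen only to echo the sequence and shot folders pictured earlier; any real convention would be defined by the organization.

```python
import re
from pathlib import PurePosixPath

def metadata_from_path(path):
    """Derive metadata from a path that follows the assumed convention.

    Assumed (hypothetical) layout: <project>/seq_<NNN>/shot_<NNN>/<file>
    Returns a dict, or None when the path does not follow the convention.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 4:
        return None
    project, seq, shot, filename = parts
    seq_match = re.fullmatch(r"seq_(\d{3})", seq)
    shot_match = re.fullmatch(r"shot_(\d{3})", shot)
    if not (seq_match and shot_match):
        return None
    return {
        "project": project,
        "sequence": int(seq_match.group(1)),
        "shot": int(shot_match.group(1)),
        "file": filename,
    }
```

A conforming path such as `filmA/seq_047/shot_005/plate.exr` yields its project, sequence, and shot without any external catalog, while a path like `landfill/stuff/final_v2.exr` returns `None` and can be flagged as unstructured.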
       
    Store Data Mess More Cost Effectively  

     
    • Attempts to migrate data to the most cost-effective tier of storage
    • Hierarchical Storage Management
    • Information Lifecycle Management
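
The tiering idea behind these approaches can be sketched as a simple age-based policy. The tier names (`fast`, `nearline`, `archive`), the day thresholds, and the function name `choose_tier` are all assumptions for illustration; actual HSM and ILM products apply their own, usually richer, policies.

```python
def choose_tier(days_since_access, thresholds=((30, "fast"), (365, "nearline"))):
    """Pick a storage tier from the number of days since last access.

    `thresholds` is an ordered sequence of (upper_limit_in_days, tier_name);
    the limits and names here are hypothetical. Data older than every
    limit falls through to the "archive" tier.
    """
    for limit, tier in thresholds:
        if days_since_access < limit:
            return tier
    return "archive"
```

A policy like this moves data to cheaper storage based on age alone; it says nothing about what the data is or why it was kept, which is why it belongs in the reactive column.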