Microsoft.NET

Expertise in .NET Technologies

The .NET File System Object Model

Posted by Ravi Varma Thumati on September 22, 2009

The .NET Framework doesn’t change the structure of the file system, nor does it build a new layer on top of it. More simply, but also more effectively for developers, it supplies a new object model for file system-related operations. A managed application can work with files and directories using high-level methods rather than a low-level understanding of the file system. This article provides an overview of the methods and classes contained in the System.IO namespace.

The .NET Framework reworks, rationalizes, and simplifies key portions of the Win32 API. With very few exceptions, Microsoft has redesigned the Win32 API and made it available to programmers in an object-oriented fashion. In this article, you will see how to manage paths as a special data type with ad-hoc methods and properties, how to retrieve as much information as possible about files and directories, and how to read and write files.

Managing Files

The .NET Framework uses System.IO as the main namespace to work with file systems. Within this namespace, you can identify three groups of related classes that accomplish the following tasks:

  • Retrieve information and perform basic operations on files and directories
  • Perform string-based manipulation on paths
  • Read and write operations on data streams and files

The .NET Framework provides I/O functionality through a few global static classes, such as File, Directory, and Path. These classes expose only static (or shared in Visual Basic .NET) members, so you don’t need to create instances of the classes in order to use them. File, Directory, and Path are just repositories of global, type-specific functions that you call to create, copy, delete, move, and open files and directories. All of these functions require a file or a directory name to operate. To write or read files, you also have specific classes to manage streams and bytes at your disposal.

If you’re going to work with files within a .NET managed application, chances are good that you have to use the methods from the File class. So let’s start by taking a look at the methods exposed by this class (see Table 1).

The path parameter that all methods require can indicate a relative or absolute path. A relative path is interpreted as relative to the current working directory. To obtain the current working directory, you use the GetCurrentDirectory method provided by the Directory class. Any of the methods in Table 1 that perform write operations will create the specified file if it does not exist. If the file does exist, it will be overwritten as long as it is not marked read-only.
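As a minimal sketch of how these static methods fit together, the following example writes a file, copies it, and reads the copy back. The class name FileBasics, the helper WriteCopyRead, and the file names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;

public static class FileBasics
{
    // Writes text to 'source', copies it to 'copy', and returns the
    // text read back from the copy. Relative names are resolved
    // against the current working directory.
    public static string WriteCopyRead(string source, string copy, string text)
    {
        // CreateText creates the file, or overwrites it if it exists
        using (StreamWriter writer = File.CreateText(source))
        {
            writer.Write(text);
        }

        // The third argument allows an existing destination to be overwritten
        File.Copy(source, copy, true);

        using (StreamReader reader = File.OpenText(copy))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        Console.WriteLine("Working directory: " + Directory.GetCurrentDirectory());
        Console.WriteLine(WriteCopyRead("sample.txt", "sample-copy.txt", "Hello, System.IO"));

        // Clean up the hypothetical sample files
        File.Delete("sample.txt");
        File.Delete("sample-copy.txt");
    }
}
```

Because each call names the file explicitly, this style suits one-shot operations; nothing about the file is kept between calls.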

Each time an application invokes a method on the File class, a security check is performed on the involved file system elements. The check verifies that the current user has the permission to perform the specified operation. If you use the same file or directory several times, this embedded security check might result in a slight performance hit. For this and other reasons, the .NET Framework defines an instance-specific type to wrap the functionality of files called the FileInfo class. If you need to access a file in a repeated fashion, you can use the FileInfo class to perform the security check only once. Should you always use FileInfo and disregard the File class? Well, consider that, in general, the methods of the global classes have an internal implementation that results in more direct code. For this reason, global objects are preferable for one-shot calls.

If you look at the overall set of functionality both provide, the FileInfo class looks very similar to the static File class. However, the internal implementation and the programming interface are slightly different. The FileInfo class works on a particular file and requires that you instantiate the class before you access its methods and properties.

FileInfo fi = new FileInfo("mydoc.txt");

When you create an instance of the FileInfo class, you specify a filename, either fully or partially qualified. The filename you indicate is only checked for name consistency and not for existence. If the filename you indicate through the class constructor is unacceptable, an exception is thrown. Common pitfalls are colons in the middle of the string, invalid characters, blank names, or paths that exceed the system-defined maximum length (about 260 characters on the .NET Framework). Table 2 lists the properties of the FileInfo class.

The methods available for the FileInfo class are summarized in Table 3. As you can see, you can group methods into two categories: methods to perform simple stream-based operations on the contents of the file, and methods to copy or delete the file itself.

The FileInfo class represents a logical wrapper for a system element that is continuously subject to concurrent changes. Can you be sure that the information returned by the FileInfo object is always up to date? Properties such as Exists, Length, Attributes, and LastAccessTime can easily contain inconsistent values if other users may make changes concurrently.

When you create an instance of FileInfo, no information is actually read from the file system. As soon as you attempt to read the value of one of the aforementioned critical properties, the class invokes the Refresh method, reads the current state of the file, and caches that information. For performance reasons, though, the FileInfo class doesn’t automatically refresh the state of the object each time properties are read. It does that only the first time that it reads one of the properties.

Because of this built-in caching behavior, you should call Refresh whenever you need to read up-to-date information about the attributes or the length of a file. Whether or not you need to refresh this data depends greatly on the needs of your application. Under the hood, the Refresh method makes a call to the Win32 FindFirstFile function and uses the information contained in the returned WIN32_FIND_DATA structure to populate the properties of the FileInfo class. You need to consider whether or not the application needs the overhead of calling this API function.
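The caching behavior can be observed with a short sketch like the one below. The class name FileInfoRefreshDemo and the helper StaleThenFresh are hypothetical; the example assumes the text is written with the default UTF-8 encoding, so each ASCII character takes one byte.

```csharp
using System;
using System.IO;

public static class FileInfoRefreshDemo
{
    // Returns three readings of FileInfo.Length: the first read, a read
    // after the file has grown (served from the cached state), and a
    // read after calling Refresh.
    public static long[] StaleThenFresh(string path)
    {
        File.WriteAllText(path, "12345");
        FileInfo fi = new FileInfo(path);
        long first = fi.Length;       // first access reads the file state

        File.AppendAllText(path, "67890");
        long cached = fi.Length;      // typically still the cached value

        fi.Refresh();                 // re-read the state from the file system
        long fresh = fi.Length;

        return new long[] { first, cached, fresh };
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "refresh-demo.txt");
        long[] lengths = StaleThenFresh(path);
        Console.WriteLine("first={0} cached={1} fresh={2}",
                          lengths[0], lengths[1], lengths[2]);
        File.Delete(path);
    }
}
```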

Table 1: Methods exposed by the File class.

Method Name        Description
AppendText         Creates and returns a stream object for the specified file. The stream allows you to append UTF-8 encoded text.
Copy               Copies an existing file to a new file. The destination cannot be a directory name or an existing file.
Create             Creates a new file.
CreateText         Creates a new file (or opens one if a file already exists) for writing UTF-8 text.
Delete             Deletes the file specified.
Exists             Determines whether the specified file exists.
GetAttributes      Gets the attributes of the file.
GetCreationTime    Returns the creation date and time of the specified file.
GetLastAccessTime  Returns the last access date and time for the specified file.
GetLastWriteTime   Returns the last write date and time for the specified file.
Move               Moves a specified file to a new location. Also provides the option to specify a new filename.
Open               Opens a file on the specified path.
OpenRead           Opens an existing file for reading.
OpenText           Opens an existing UTF-8 encoded text file for reading.
OpenWrite          Opens an existing file for writing.
SetAttributes      Sets the specified attributes for the given file.
SetCreationTime    Sets the date and time the file was created.
SetLastAccessTime  Sets the date and time the specified file was last accessed.
SetLastWriteTime   Sets the date and time that the specified file was last written.

Table 2: Properties of the FileInfo class.

Property Name   Description
Attributes      Gets or sets the attributes of the current file.
CreationTime    Gets or sets the time when the current file was created.
Directory       Returns a DirectoryInfo object representing the parent directory.
DirectoryName   Gets a string representing the directory’s full path.
Exists          Indicates whether a file with the current name exists.
Extension       Gets the string representing the extension of the filename, including the period (.).
FullName        Returns the full path of the current file.
LastAccessTime  Gets or sets the time when the current file was last accessed.
LastWriteTime   Gets or sets the time when the current file was last written.
Length          Returns the size in bytes of the current file.
Name            Returns the name of the file.

Table 3: Methods of the FileInfo class.

Method      Description
AppendText  Creates and returns a StreamWriter object for the current file. The writer allows you to append UTF-8 encoded text.
CopyTo      Copies the current file to a new file.
Create      Creates a file. It’s a simple wrapper for the File.Create method.
CreateText  Creates a file and returns a StreamWriter object to write text.
Delete      Permanently deletes the current file. Fails if the file is open.
MoveTo      Moves the current file to a new location, providing the option to specify a new filename.
Open        Opens the file with various read/write and sharing privileges.
OpenRead    Creates and returns a read-only stream for the file.
OpenText    Creates and returns a StreamReader object to read text from the file.
OpenWrite   Creates and returns a write-only FileStream object that you can use to write to the file.
Refresh     Refreshes the information that the class holds about the file.
ToString    Returns a string that represents the fully qualified path of the file.

Copying and Deleting Files

To make a copy of the current file, you can use the CopyTo method, which comes with two overloads. Both overloads copy the file to another file but the first overload just disallows overwriting, while the other gives you a chance to control overwriting through a Boolean parameter.

FileInfo copy = fi.CopyTo("NewFile.txt", true);

Notice that both methods require that the first argument be a filename. It can’t be the name of a directory where you want the file to be copied. If you use a directory name, that will be the name of the output file.

The Delete method permanently deletes the file from disk. Using this method, there is no way to programmatically send the deleted file to the recycle bin. To put a file in the recycle bin you must resort to creating a .NET wrapper for the Win32 API function that does that. The API function you need is named SHFileOperation.

The Attributes property indicates the file system attributes of the given file. In order to set or read an attribute, the file must already exist and the application must have access to it. To write an attribute value to a file, you must also have write permission; otherwise an exception is raised (a SecurityException if the caller lacks the required FileIOPermission). The attributes of a file are expressed using the FileAttributes type. (See Table 4.)

The values in the table correspond to those defined in the Win32 SDK. Notice that not all attributes are applicable to both files and directories. You set attributes on a file using code as in the code snippet below.

// Make the file read-only and hidden
FileInfo fi = new FileInfo("mydoc.txt");
fi.Attributes = FileAttributes.ReadOnly |
                FileAttributes.Hidden;

Note that you cannot set all of the attributes listed in Table 4 through the Attributes property. For example, the system assigns the Encrypted and the Compressed attributes only if the file is contained in an encrypted or compressed folder. Likewise, you can give a file a reparse point or mark it as a sparse file only through specific API functions and only on NTFS volumes.
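Because FileAttributes is a flags enumeration, individual attributes are usually tested and toggled with bitwise operators rather than by assigning the whole value. The sketch below illustrates this for the ReadOnly flag; the class name AttributeDemo and its helpers are hypothetical.

```csharp
using System;
using System.IO;

public static class AttributeDemo
{
    // True if the ReadOnly flag is set on the file
    public static bool IsReadOnly(string path)
    {
        return (File.GetAttributes(path) & FileAttributes.ReadOnly) != 0;
    }

    // Sets or clears only the ReadOnly flag, preserving the other flags
    public static void SetReadOnly(string path, bool readOnly)
    {
        FileAttributes attrs = File.GetAttributes(path);
        if (readOnly)
            attrs |= FileAttributes.ReadOnly;
        else
            attrs &= ~FileAttributes.ReadOnly;
        File.SetAttributes(path, attrs);
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "attr-demo.txt");
        File.WriteAllText(path, "demo");

        SetReadOnly(path, true);
        Console.WriteLine(IsReadOnly(path));

        // Clear the flag again so that the file can be deleted
        SetReadOnly(path, false);
        File.Delete(path);
    }
}
```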

Working with Directories

To manage a directory as an object you use the Directory global object or the DirectoryInfo class. The global Directory class exposes static methods for creating, deleting, and moving directories and for enumerating their files and subdirectories. Table 5 lists the methods on the Directory class.

Note that the Delete method has two overloads. By default, it deletes only empty directories and throws an IOException if the directory is not empty or is marked read-only. The second overload includes a Boolean argument that, if set to true, enables the method to recursively delete the entire directory tree.

// Clear a directory tree
Directory.Delete(dirName, true);
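The difference between the two Delete overloads can be sketched as follows. The class name DirectoryTreeDemo, the helper CreateThenDelete, and the directory names are hypothetical.

```csharp
using System;
using System.IO;

public static class DirectoryTreeDemo
{
    // Builds root/a/b with one file in it, then removes the whole tree.
    // Returns whether the root still exists afterwards.
    public static bool CreateThenDelete(string root)
    {
        string leaf = Path.Combine(Path.Combine(root, "a"), "b");
        Directory.CreateDirectory(leaf);   // creates root, a, and b as needed
        File.WriteAllText(Path.Combine(leaf, "note.txt"), "temp");

        try
        {
            // The single-argument overload only removes empty directories
            Directory.Delete(root);
        }
        catch (IOException)
        {
            // The tree is not empty, so delete it recursively
            Directory.Delete(root, true);
        }

        return Directory.Exists(root);
    }

    public static void Main()
    {
        string root = Path.Combine(Path.GetTempPath(), "tree-demo");
        Console.WriteLine("Still exists: " + CreateThenDelete(root));
    }
}
```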

The DirectoryInfo class represents the instance-based counterpart of the Directory class and works on a particular directory.

DirectoryInfo di = new DirectoryInfo(@"c:\");

To create an instance of the DirectoryInfo class, you specify a fully qualified path. Just as for FileInfo, the path is checked for consistency but not for existence. Note that the path can also be a filename or a Universal Naming Convention (UNC) name. If you create a DirectoryInfo object passing a filename, the class will use the directory that contains the specified file. Table 6 shows the properties available with the DirectoryInfo class.

The Name property of the file and directory classes is read-only and you cannot use it to rename the corresponding file system element. The methods you can use on the DirectoryInfo class are listed in Table 7.

The GetFileSystemInfos method returns an array of objects, each of which points to a file or a subdirectory contained in the directory bound to the current DirectoryInfo object. Unlike the GetDirectories and GetFiles methods of the static Directory class, which simply return the names of subdirectories and files as plain strings, GetFileSystemInfos returns a strongly-typed object for each entry, either DirectoryInfo or FileInfo. The return type of the method is an array of FileSystemInfo objects.

public FileSystemInfo[] GetFileSystemInfos()

FileSystemInfo is the base class for both FileInfo and DirectoryInfo. GetFileSystemInfos has an overloaded version that can accept a string with search criteria.

Let’s see how to use the file and directory classes to build a simple console application that lists the contents of a directory. The full source code is presented in Listing 1.

GetFileSystemInfos accepts a filter string that you can use to set some criteria. The filter string can contain wildcard characters such as ? and *. The ? character is a placeholder for any individual character, while * represents any string of zero or more characters. A bit more problematic is selecting all files that belong to one group or another. Likewise, there’s no direct way to obtain all directories plus all the files that match certain criteria. In such cases, you must query each result set individually and then combine them in a single array of FileSystemInfo objects. The following code snippet shows how to select all the subdirectories and all the aspx pages in a given folder.

FileSystemInfo[] fsiDirs = di.GetDirectories();
FileSystemInfo[] fsiAspx = di.GetFiles("*.aspx");

You can fuse the two arrays together using the methods of the Array class.
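One way to do this, sketched here under the assumption that a plain concatenation (directories first, then files) is what you want, is to allocate a result array and fill it with two Array.Copy calls. The class name DirectoryQuery and the helper DirsAndFiles are hypothetical.

```csharp
using System;
using System.IO;

public static class DirectoryQuery
{
    // Returns all subdirectories of 'di' followed by the files that
    // match 'filePattern', in a single FileSystemInfo array.
    public static FileSystemInfo[] DirsAndFiles(DirectoryInfo di, string filePattern)
    {
        // Array covariance lets DirectoryInfo[] and FileInfo[] be
        // treated as FileSystemInfo[]
        FileSystemInfo[] dirs = di.GetDirectories();
        FileSystemInfo[] files = di.GetFiles(filePattern);

        FileSystemInfo[] all = new FileSystemInfo[dirs.Length + files.Length];
        Array.Copy(dirs, 0, all, 0, dirs.Length);
        Array.Copy(files, 0, all, dirs.Length, files.Length);
        return all;
    }

    public static void Main(string[] args)
    {
        string dirName = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        foreach (FileSystemInfo fsi in DirsAndFiles(new DirectoryInfo(dirName), "*.aspx"))
        {
            Console.WriteLine(fsi.Name);
        }
    }
}
```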

Working with Paths

Although paths are nothing more than strings, it’s a common feeling that they deserve a tailor-made set of functions to make them easier to manipulate. The Path type lets programmers perform operations on strings that contain file or directory path information. Path is a static class: you never create an instance of it, and it contains only static methods. A path can contain either absolute or relative location information for a given file or folder. If the information about the location is incomplete, the class completes it using the current location, if applicable.

The members of the Path class let you perform everyday operations such as: determining whether a given filename has a certain extension, changing the extension of a filename leaving all the remainder of the path intact, combining partial path strings into one valid path, and more. The Path class doesn’t work in conjunction with the operating system and should be simply viewed as a highly specialized string manipulation class.

The members of the Path class never interact with the file system to verify the correctness of a filename. Even though you can combine two strings to get a valid directory name, that would not be sufficient to actually create that new directory. On the other hand, the members of the Path class are smart enough to throw an exception if they detect that a path string contains invalid characters. Table 8 lists the methods of the Path class.

It’s interesting to notice that any call to GetTempFileName promptly results in creating a zero-length file on disk, specifically in the temporary folder returned by GetTempPath (e.g., C:\Windows\Temp). This is the only case in which the Path class happens to interact with the operating system.
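The string-oriented nature of the Path class is easy to see in a short sketch. None of the calls below touch the disk, and the path used does not need to exist. The class name PathDemo, the helper Describe, and the sample path are hypothetical.

```csharp
using System;
using System.IO;

public static class PathDemo
{
    // Splits a path into its file name, bare name, and extension, and
    // also returns the file name with its extension changed to .bak.
    public static string[] Describe(string path)
    {
        string fileName = Path.GetFileName(path);
        return new string[]
        {
            fileName,
            Path.GetFileNameWithoutExtension(path),
            Path.GetExtension(path),
            Path.ChangeExtension(fileName, ".bak")
        };
    }

    public static void Main()
    {
        // Combine inserts the platform's directory separator
        string path = Path.Combine("reports", "summary.txt");

        foreach (string part in Describe(path))
        {
            Console.WriteLine(part);
        }
        Console.WriteLine("Rooted: " + Path.IsPathRooted(path));
        Console.WriteLine("Has extension: " + Path.HasExtension(path));
    }
}
```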

Table 4: The FileAttributes enumeration.

Attribute          Description
Archive            Indicates that the file is an archive.
Compressed         The file is compressed.
Device             Not currently used. Reserved for future use.
Directory          The file is a directory.
Encrypted          The file or directory is encrypted. For a file, this means that all data in the file is encrypted. For a directory, this means that encryption is the default for newly created files and directories but not necessarily that all current files are encrypted.
Hidden             The file is hidden and doesn’t show up in directory listings.
Normal             The file has no other attributes set. Note that this attribute is valid only if used alone.
NotContentIndexed  The file should not be indexed by the system indexing service.
Offline            The file is offline and its data is not immediately available.
ReadOnly           The file is read-only.
ReparsePoint       The file contains a reparse point, which is a block of user-defined data associated with a file or a directory. Requires an NTFS file system.
SparseFile         The file is a sparse file. Sparse files are typically large files whose data are mostly zeros. Requires an NTFS file system.
System             The file is a system file, part of the operating system or used exclusively by the operating system.
Temporary          The file is temporary and can be deleted by the application any time soon.

Table 5: Methods on the Directory class.

Method Name           Description
CreateDirectory       Creates all directories and subdirectories in the specified path, as needed.
Delete                Deletes a directory and, optionally, all of its contents.
Exists                Determines whether the given directory exists.
GetCreationTime       Gets the creation date and time of the specified directory.
GetCurrentDirectory   Gets the current working directory of the application.
GetDirectories        Returns an array of strings filled with the names of the child subdirectories of the specified directory.
GetDirectoryRoot      Gets volume and root information for the specified path.
GetFiles              Returns the names of files in the specified directory.
GetFileSystemEntries  Returns an array of strings filled with the names of all files and subdirectories contained in the specified directory.
GetLastAccessTime     Returns the date and time the specified directory was last accessed.
GetLastWriteTime      Returns the date and time the specified directory was last written.
GetLogicalDrives      Returns an array of strings filled with the names of the logical drives found on the computer. Strings have the form "<drive letter>:\".
GetParent             Retrieves the parent directory of the specified path. The directory is returned as a DirectoryInfo object.
Move                  Moves a directory and its contents to a new location. An exception is thrown if you move the directory to another volume or if a directory with the same name exists.
SetCreationTime       Sets the creation date and time for the specified directory.
SetCurrentDirectory   Sets the application’s current working directory.
SetLastAccessTime     Sets the date and time the specified file or directory was last accessed.
SetLastWriteTime      Sets the date and time a directory was last written to.

Table 6: Properties of the DirectoryInfo class.

Property        Description
Attributes      Gets or sets the attributes of the current directory.
CreationTime    Gets or sets the creation time of the current directory.
Exists          Determines whether the directory exists.
Extension       Returns the extension (if any) in the directory name.
FullName        Returns the full path of the directory.
LastAccessTime  Gets or sets the time when the current directory was last accessed.
LastWriteTime   Gets or sets the time when the current directory was last written.
Name            Returns the name of the directory bound to this object.
Parent          Returns the parent of the directory bound to this object.
Root            Returns the root portion of the directory path.

Table 7: Methods of the DirectoryInfo class.

Method Name         Description
Create              Creates a directory. It’s a simple wrapper for the Directory.CreateDirectory method.
CreateSubdirectory  Creates a subdirectory on the specified path. The path can be relative to this instance of the DirectoryInfo class.
Delete              Deletes the directory.
GetDirectories      Returns an array of DirectoryInfo objects, each pointing to a subdirectory of the current directory.
GetFiles            Returns an array of FileInfo objects, each pointing to a file contained in the current directory.
GetFileSystemInfos  Retrieves an array of FileSystemInfo objects representing all the files and subdirectories in the current directory.
MoveTo              Moves a directory and all of its contents to a new path.
Refresh             Refreshes the state of the DirectoryInfo object.

Table 8: Methods of the Path class.

Method Name                  Description
ChangeExtension              Changes the extension of the specified path string.
Combine                      Concatenates two path strings together.
GetDirectoryName             Extracts and returns the directory information for the specified path string.
GetExtension                 Returns the extension of the specified path string.
GetFileName                  Returns filename and extension of the specified path string.
GetFileNameWithoutExtension  Returns the filename of the specified path string without the extension.
GetFullPath                  Returns the absolute path for the specified path string.
GetPathRoot                  Returns the root directory for the specified path.
GetTempFileName              Returns a unique temporary filename and creates a zero-byte file by that name on disk.
GetTempPath                  Returns the path of the temporary folder.
HasExtension                 Determines whether the specified path string includes an extension.
IsPathRooted                 Returns a value that indicates whether the specified path string contains an absolute path.

Listing 1: A directory listing utility
using System;
using System.IO;

public class TextDirs
{
  public static void Main(string[] args)
  {
    // Default to the root of drive C: unless a directory is passed in
    string dirName = @"c:\";
    if (args.Length > 0)
      dirName = args[0];

    DirectoryInfo di = new DirectoryInfo(dirName);
    foreach (FileSystemInfo fsi in di.GetFileSystemInfos("*.*"))
    {
      string text = "";

      // Creation time
      text += fsi.CreationTime.ToString() + '\t';

      // Type and size
      if (fsi is DirectoryInfo)
      {
        text += "<DIR>" + '\t';
        text += "     " + '\t';
      }
      else
      {
        text += "      " + '\t';
        FileInfo fi = (FileInfo) fsi;
        text += String.Format("{0}", fi.Length) + '\t';
      }

      // Name
      text += fsi.Name;
      Console.WriteLine(text);
    }
  }
}

I/O with Files

In the .NET Framework, the atomic element to read from, or write to, is the stream. A stream abstracts the contents of a variety of potential data stores, including local and network disk files, memory, and databases. You can read or write a Stream object using a couple of tailor-made tools: the reader and the writer.

A reader reads one chunk of information at a time. The structure of the data read depends on the particular reader and the underlying stream. For example, a text reader will read rows of text, recognizing the carriage return/linefeed pair as the separator between chunks. Likewise, a binary reader processes the stream one byte at a time, and an XML reader moves from one node to the next. The reader operates in a read-only, forward-only way. You can’t move back to an already processed, or skipped, chunk of data; nor can you edit the current data the pointer references.

In the .NET Framework, you find available quite a few specialized readers including TextReader, BinaryReader, XmlReader, and database-specific readers such as SqlDataReader and OracleDataReader. Although all of these reader classes have a common subset of functions, and an overall similar way of working, they don’t derive from the same base class. Reader classes work on top of streams. Depending on the implementation of each individual class, the stream may be passed explicitly as a constructor argument or through its file name or URL.

The Stream class supports three basic operations: reading, writing, and seeking. Reading and writing operations entail transferring data from a stream into a data structure and vice versa. Seeking consists of querying and modifying the current position within the stream of data.
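The three operations can be sketched on a MemoryStream, which keeps the example independent of any file on disk. The class name StreamBasics and the helper WriteSeekRead are hypothetical.

```csharp
using System;
using System.IO;

public static class StreamBasics
{
    // Writes five bytes, seeks back to the third one, and reads two bytes.
    public static byte[] WriteSeekRead()
    {
        using (Stream s = new MemoryStream())
        {
            byte[] data = { 10, 20, 30, 40, 50 };
            s.Write(data, 0, data.Length);   // writing advances Position to 5

            s.Seek(2, SeekOrigin.Begin);     // move to the third byte

            byte[] buffer = new byte[2];
            s.Read(buffer, 0, buffer.Length);
            return buffer;                   // holds the bytes 30 and 40
        }
    }

    public static void Main()
    {
        byte[] result = WriteSeekRead();
        Console.WriteLine(result[0] + ", " + result[1]);
    }
}
```

The same calls work on a FileStream, because both classes derive from Stream.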

The .NET Framework provides a number of predefined Stream classes including FileStream, MemoryStream, and the fairly interesting CryptoStream, which automatically encrypts and decrypts data as you write or read. Each type of storage implements its own stream by deriving from the base Stream class. The StreamReader class is a generic reader class for any type of stream. Finally, the StringReader class lets you read a string of text using the same programming interface as readers that operate on data stores.

You transform the contents of a file into a stream using the FileStream class. The following code shows how to open a file that you want to read:

FileStream fs = new FileStream(filename, 
FileMode.Open, FileAccess.Read);

Streams supply a rather low-level programming interface which, although functionally effective, is not always apt for classes that need to perform more high-level operations such as reading the whole content of a file or a single line.

To manipulate the contents of a file as a binary stream, you just pass the FileStream object down to a specialized reader object that knows how to handle it.

BinaryReader bin = new BinaryReader(fs);

If you want to process the file’s contents in a text-based way, then you can use the StreamReader class, as shown below.

StreamReader reader;
reader = new StreamReader(fileName);
reader.BaseStream.Seek(0, SeekOrigin.Begin);
string text = reader.ReadToEnd();
reader.Close();

To write files, you often use the StreamWriter class and access its underlying stream, which can also be an encrypted stream. The following code snippet shows how to create a file.

StreamWriter writer = new StreamWriter(file);
writer.WriteLine(text);
writer.Close();
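A slightly more defensive variant of the reading and writing snippets wraps the reader and writer in using blocks, so that the underlying file is closed even if an exception is thrown. This is a sketch only; the class name TextFileDemo, its helpers, and the file name are hypothetical.

```csharp
using System;
using System.IO;

public static class TextFileDemo
{
    // Writes each string as one line of text, overwriting the file
    public static void WriteLines(string path, string[] lines)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }

    // Reads the file line by line; ReadLine returns null at end of file
    public static int CountLines(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "lines-demo.txt");
        WriteLines(path, new string[] { "first", "second", "third" });
        Console.WriteLine(CountLines(path));
        File.Delete(path);
    }
}
```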

Creating binary files that contain images or raw data follows the same guidelines. You just use BinaryWriter (or BinaryReader for reading) as the writer object, along with its ad hoc set of methods.

Every reader class has a writer counterpart. So you have a StreamWriter class acting as a generic writer for streams, along with other writer classes such as TextWriter, XmlWriter, BinaryWriter, and StringWriter. Curiously, the .NET Framework does not have a sort of SqlDataWriter class, which would configure a server cursor. Server cursors are not supported as of version 1.1 of the .NET Framework.
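As an illustration, the sketch below writes three typed values with BinaryWriter and reads them back with BinaryReader in the same order. The class name BinaryDemo, the helper RoundTrip, and the record layout (an id, a price, and a name) are assumptions made for the example.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class BinaryDemo
{
    // Writes an int, a double, and a string to a binary file, then
    // reads them back and returns them joined with '|'.
    public static string RoundTrip(string path, int id, double price, string name)
    {
        using (BinaryWriter writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            writer.Write(id);
            writer.Write(price);
            writer.Write(name);     // stored as a length-prefixed string
        }

        using (BinaryReader reader = new BinaryReader(
                   File.Open(path, FileMode.Open, FileAccess.Read)))
        {
            // Values must be read in the order in which they were written
            int readId = reader.ReadInt32();
            double readPrice = reader.ReadDouble();
            string readName = reader.ReadString();

            return readId + "|"
                 + readPrice.ToString(CultureInfo.InvariantCulture) + "|"
                 + readName;
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "binary-demo.bin");
        Console.WriteLine(RoundTrip(path, 7, 19.5, "widget"));
        File.Delete(path);
    }
}
```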

Summary

Although the substance of the underlying file system is not something that changed with .NET, the way in which you work with the constituent elements of a file system, files and directories, has changed quite a bit.

The introduction of streams as programmable objects is a key step in the sense that it unifies the API necessary to perform similar operations on conceptually similar storage media. Another key enhancement is the introduction of reader and writer objects. They provide a kind of logical API by means of which you read and write any piece of information in nearly identical ways. The .NET Framework also provides a lot of facilities to perform the basic management operations with files and directories, including path functions and common-use methods. In short, with .NET, working with the file system is easier and more effective. Just do it.

 
