Last post May 27, 2011 09:43 AM by atconway
May 25, 2011 10:56 AM|hambi|LINK
I am developing a site that could have a large number of users that generate/upload a lot of data. I have two questions that, to me, are both to do with architecture but could just be data structure questions:
1. I have two complex data items (X and Y) that represent different data but have similar fields (by complex I mean they are not simple int/string fields). While X and Y are different data items with different functions within the site, they have very similar data structures and could be edited/created in a common manner. I guess the norm would be two separate tables, one for each data item. However, I could create a single table sufficient to store the data for both X and Y, with a way to determine which type of item is held in a particular record. This would allow me to have common pages to display and edit data whether it is type X or Y, while still supporting the distinct functionality associated with each type of data.
I can see a lot of advantages to using a single table for both types of data rather than two tables. My main concern is performance: both types of data are central to the site, there will be large volumes of each, and the site needs to search the data. A single table effectively doubles the amount of data in one table compared to keeping a separate table for each type of data.
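To make the single-table idea concrete, here is a minimal sketch (in Python with the stdlib sqlite3 module purely for illustration; the real site would presumably use SQL Server, and all table/column names here are invented): both X and Y rows live in one table with a type discriminator column, which common pages can ignore and type-specific pages can filter on.

```python
import sqlite3

# Hypothetical single-table design: one "items" table holds both X and Y
# records, distinguished by an item_type discriminator column.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        item_type TEXT NOT NULL CHECK (item_type IN ('X', 'Y')),
        title     TEXT NOT NULL,
        body      TEXT
    )
""")
# Index the discriminator so type-specific searches don't scan both kinds.
conn.execute("CREATE INDEX ix_items_type ON items (item_type)")

conn.execute("INSERT INTO items (item_type, title) VALUES ('X', 'first X')")
conn.execute("INSERT INTO items (item_type, title) VALUES ('Y', 'first Y')")

# Common display/edit code queries all rows; type-specific code filters.
x_rows = conn.execute(
    "SELECT title FROM items WHERE item_type = 'X'").fetchall()
print(x_rows)  # [('first X',)]
```

The performance concern in the paragraph above is partly mitigated by that index on the discriminator, but the table is still physically larger than either of two separate tables would be.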
2. Users can upload photographs, and I am considering the best way to store them. The site may also generate some files associated with each user, so there is some logic to creating a directory for each user. Is there any best-practice way to deal with user data and how to store it? There may be a very large number of users, and I want to adopt a directory structure, etc. that will be robust and will migrate cleanly when the data exceeds the capacity of a single server.
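One common scheme for the per-user directory question is to fan directories out by a hash of the user id, so no single directory accumulates millions of entries and the top-level prefixes can later be split across servers. A small sketch (the layout and names are illustrative, not a recommendation from this thread):

```python
import hashlib
from pathlib import Path

def user_dir(root: str, user_id: int) -> Path:
    """Map a user id to a two-level fan-out directory.

    Hashing spreads users evenly across 256 * 256 buckets; the hex
    prefixes also give a natural unit for migrating data to another
    server when one fills up. Scheme is illustrative only.
    """
    digest = hashlib.sha1(str(user_id).encode()).hexdigest()
    return Path(root) / digest[:2] / digest[2:4] / str(user_id)

print(user_dir("/data/uploads", 12345))
```

Because the mapping is deterministic, any web server in a farm can compute the path for a user without a lookup.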
May 26, 2011 06:49 AM|GPankaj|LINK
Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships. I would suggest you use normalization; selective denormalization can subsequently be performed for performance reasons.
You can store the photographs in the database in BLOB format. It would take a little longer to display the images up front, as converting the data takes time. If the images are stored in the file system and you store the reference in a database table, then you need to grant the proper permission for the application to write data to your file system.
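The file-system variant described above can be sketched as follows (again in Python/sqlite3 for illustration; table, column, and function names are invented): the photo bytes go on disk under a per-user folder and only a relative path is stored in the database.

```python
import sqlite3
import tempfile
from pathlib import Path

# Photos table stores a reference (relative path), not the image bytes.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE photos (
        id       INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL,
        rel_path TEXT NOT NULL
    )
""")

def save_photo(root: Path, user_id: int, name: str, data: bytes) -> str:
    """Write the image to disk and record its relative path in the DB."""
    target_dir = root / str(user_id)
    # This mkdir/write is where the app account needs file-system
    # write permission, as noted above.
    target_dir.mkdir(parents=True, exist_ok=True)
    (target_dir / name).write_bytes(data)
    rel = f"{user_id}/{name}"
    conn.execute("INSERT INTO photos (user_id, rel_path) VALUES (?, ?)",
                 (user_id, rel))
    return rel

root = Path(tempfile.mkdtemp())
print(save_photo(root, 7, "cat.jpg", b"\xff\xd8fake-jpeg"))
```

Serving an image then becomes a plain file read (or a static-file URL), avoiding the BLOB conversion cost mentioned above.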
May 27, 2011 09:43 AM|atconway|LINK
Answer to #1: If you are referring to (1) or (2) tables in the database, then this is a decision about how far to normalize the schema: make (2) tables or (1). If you believe there are several instances where it makes more sense to have the data normalized into (2) different tables, then I would go that direction and just use a JOIN when appropriate in SQL to seamlessly bring that data back together. It is really more of a database question than one of coding, because by the time the app receives that data via whatever access method you choose (ADO.NET, etc.) it can be in whatever form you want it to be for your app and does not need to resemble the database structure at all.
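A quick sketch of the two-table route (Python/sqlite3 for illustration; names invented): each type gets its own table, and when common display pages need both kinds at once, a compound SELECT recombines them. Note that for two sibling tables with similar columns the usual recombining operator is UNION ALL rather than a JOIN, which applies when rows in the two tables are related to each other.

```python
import sqlite3

# Hypothetical normalized design: one table per item type.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x_items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE y_items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    INSERT INTO x_items (title) VALUES ('an X');
    INSERT INTO y_items (title) VALUES ('a Y');
""")

# Common pages recombine the two tables, synthesizing a type label
# so the app can still tell the rows apart.
rows = conn.execute("""
    SELECT 'X' AS item_type, title FROM x_items
    UNION ALL
    SELECT 'Y' AS item_type, title FROM y_items
    ORDER BY item_type
""").fetchall()
print(rows)  # [('X', 'an X'), ('Y', 'a Y')]
```

As the answer says, the app never has to see which shape the schema took: either design can feed the same objects to the common edit/display pages.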
Answer to #2: Use FILESTREAM in SQL Server 2008. The following (2) links will help explain it:
Hope this helps!