简体   繁体   中英

Efficient data structure to hold employee's activities?

I have n excel files in a directory that stores employees' activities per day in a month. They have three columns: date, activity and category. I don't need the category.

I want to basically read the n excel files and output a word document that effectively puts all activities per employee ordered by date, for example:

Day 1 to day 5:

Day one:
Employee #1:
-Task a
-Task b
-Task c

Employee #2:
-Task a
-Task b
-Task c
...

Day 2: ...
...
...
Day 7 to day 11:
... ...

I want to know what data structure i could use to effectively hold this information so i can easily write the document i want. Right now, i am using an array of string to hold each excel line, and store all of them in a List, which then i store in a dictionary for each employee, the key being the username of each employee.

While i think these data structures are efficient themselves, they aren't too friendly for my main objective which is to print the data ordered per day for each employee, so maybe the key itself should be the date.

To recap: current data structure being used:

Dictionary<string,List<string[]>> dictActividades = new     Dictionary<string,List<string[]>>();

All lines in each employee's excel file are stored in a List. I really don't need the dictionary features because i could be reading everything back in order. I could be printing as soon as i am reading a employee's excel file but i would have to read the n excel files at once (although n is very small)

Thoughts?

EDIT: here's what i currently have:

        string directorioActividades = @"\\mar-fp01\mar_tecnologia$\Coordinacion de Apoyo a Usuarios\Informes\" + 
            fechaInicio.Year.ToString() + "\\" + fechaInicio.Year.ToString() + "-" + 
            fechaInicio.Month.ToString().PadLeft(2, '0');

        string[] archivos = Directory.GetFiles(directorioActividades, "*.xlsx");
        Dictionary<string,List<string[]>> dictActividades = new Dictionary<string,List<string[]>>();
        for (int j = 0; j < archivos.Length; j++)
        {
            List<string[]> actividades = new List<string[]>();
            string nombreArchivo = Path.GetFileNameWithoutExtension(archivos[j]);
            String excelConnectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;" +
                "Data Source=" + archivos[j] + ";" +
                "Extended Properties=Excel 8.0; ";
            using (OleDbConnection con = new OleDbConnection(excelConnectionString))
            {
                OleDbCommand command = new OleDbCommand("Select * From [Actividades$]", con);
                con.Open();

                OleDbDataReader dr = command.ExecuteReader();
                int cantidadcolumnas = dr.FieldCount;

                string tipodatos = null;
                string[] filaDatos = new string[cantidadcolumnas];
                while (dr.Read())
                {
                    for (int k = 0; k < cantidadcolumnas; k++)
                    {
                        tipodatos = dr.GetFieldType(k).ToString();
                        if (tipodatos == "System.Int32")
                        {
                            filaDatos[k] = dr.GetInt32(k).ToString();
                        }
                        if (tipodatos == "System.String")
                        {
                            filaDatos[k] = dr.GetString(k);
                        }
                        if (tipodatos == "System.DateTime")
                        {
                            filaDatos[k] = dr.GetDateTime(k).ToShortDateString();
                        }
                    }
                    actividades.Add(filaDatos);
                }//while dr.read
            }
            dictActividades.Add(nombreArchivo, actividades);
        }//for archivos

While this code is quite short and uses the fewest data structures i can think of, printing is quite difficult because the key is the employee's username and not the date and the code should print every activity by every employee by date as the example in the format posted above

Performing logic on this data is going to be very unintuitive if you don't organize it semantically and include the logic in the models. Consider the quote:

"Smart data structures and dumb code works a lot better than the other way around."
- Eric Raymond, The Cathedral and the Bazaar

Your described structure contains all of the information you need:

Day one:
Employee #1:
-Task a
-Task b
-Task c

So you start with a Task object:

class Task
{
    // fields which describe a Task
}

(You might want to name it something slightly different to avoid confusion with the built-in Task type. Maybe Job or JobTask or something like that?)

An Employee has a collection of Tasks:

class Employee
{
    public IList<Task> Tasks { get; set; }
}

And a Day has a collection of Employees:

class Day
{
    public IList<Employee> Employees { get; set; }
}

(If this domain extends beyond just this one operation then you might even name things a bit differently. Technically these are "Employees" so much as they are "EmployeeTaskLogs" for example. The size and complexity of the domain will guide any such naming. But definitely put some thought into the most sensible names, that's the whole point of this exercise.)

Then your top-level consuming code would just have a collection of Days:

var days = new List<Day>();

You can further refine the business logic by creating custom collection types. For example, if you want to take advantage of a Dictionary of days:

class DayDictionary<T> : IDictionary<T> where T : Day
{
    // implement IDictionary<T> here
}

This is where you'd include logic for things like ensuring that any given collection has exactly 5 Day objects, for example.

Once you've defined your model structures, using them becomes trivial and the semantics of them become obvious.

I would suggest a simple class like this

class EmployeeActivity
{
    public string Employee { get; set; }
    public DateTime Date { get; set; }
    public string Activity { get; set; }
}

an no special data structure - just List<EmployeeActivity> . Once populated, you can use LINQ to perform ordering/grouping needed.

Imagine instead of you dictionary you populate what I'm suggesting

var empoyeeActivies = new List<EmployeeActivity>();
// Iterate excel files like in your code and populate the list

Now you can transform it to the way shown in your example with

var result = employeeActivities
    .GroupBy(a => a.Date, (date, dateActivities) => new
    {
        Date = date,
        DateActivities = dateActivities
            .GroupBy(a => a.EmployeeName, (employeeName, employeeActivities) => new
            {
                EmployeeName = employeeName,
                Activities = empoyeeActivities.OrderBy(a => a.Activity)
            })
            .OrderBy(a => a.EmployeeName)
    })       
    .OrderBy(a => a.Date);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM