Collecting, analyzing, and reporting with data can be daunting. When SAGE Publishing, the parent of MethodSpace, has questions, it turns to Diana Aleman, editor extraordinaire for SAGE Stats and U.S. And now she is bringing her trials, tribulations, and expertise with data to you in a monthly blog, Tips with Diana. Stay tuned for Diana’s experiences, tips, and tricks with finding, analyzing, and visualizing data.

There may be more than a few data points to double-check as you review and clean a data file. These can include blank values, outlier data points, data label misspellings, and so on. Duplicate data points are probably among the most difficult to spot unless you’re lucky.

Duplicates are exactly what they sound like: exact copies of the same data point. For instance, if I am looking at a data set on the number of hamsters across the United States and I see that Wisconsin has two data points, both of which are 50,000 (totally fabricated!), then I can infer that the data set has mistakenly included the Wisconsin value twice.

So why does this matter? It matters because duplicate data points may inadvertently lead to miscalculation or misunderstanding of the data. The appearance of duplicates does not necessarily mean the entire data set is wrong, only that it may require a closer eye and some additional clean-up work, as most data sets do.

Thankfully, Excel offers two handy features that simplify the identification and removal of duplicate data points from a file!
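
For readers who like to see the idea outside a spreadsheet, here is a minimal Python sketch of the Wisconsin hamster example. It is not one of the Excel features, only an illustration of what "finding" and "removing" exact duplicates means. The rows, the Illinois and Minnesota figures, and the function names `find_duplicates` and `remove_duplicates` are all made up for this illustration.

```python
from collections import Counter

# Hypothetical rows mirroring the made-up hamster example: (state, hamster_count).
# Every number here is fabricated for illustration.
rows = [
    ("Illinois", 72000),
    ("Wisconsin", 50000),
    ("Minnesota", 41000),
    ("Wisconsin", 50000),  # exact copy of an earlier row
]


def find_duplicates(records):
    """Return each record that appears more than once, with how often it appears."""
    counts = Counter(records)
    return {record: n for record, n in counts.items() if n > 1}


def remove_duplicates(records):
    """Keep the first occurrence of each record and preserve the original order."""
    return list(dict.fromkeys(records))


duplicates = find_duplicates(rows)
cleaned = remove_duplicates(rows)

print("Duplicated rows:", duplicates)
print("Cleaned rows:", cleaned)
```

Note that this only catches exact copies, where both the state and the value match. Two different values for Wisconsin would not be flagged, and deciding which one is correct would still take a closer look at the source.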