Research Portfolio Post #6: Quantitative Data Sources

I hope to explain the variances in cyberwarfare tactics utilized by international actors.

By collecting empirical data, building a matrix of the differing cyber tactics, and then applying that matrix to the international actors involved in the cyberwarfare landscape, I will be able to see how actors employ different methods to achieve their cybersecurity goals. To achieve this, my research question needs to meet the large-n standards of framing:

“What explains the variances in cyberwarfare tactics utilized by international actors?”

The data collected to address this question comes primarily from government and business sources. Collecting empirical data points of this type has historically been difficult, however, as both state and business actors are often reluctant to release information proving that their systems have been penetrated. The first database comes from the Home Office of the United Kingdom and presents cybercrime committed against businesses based in England and Wales.[1] The second database comes from the IPSOS Mori Social Research Institute and presents survey information, asking businesses which types of incursions occurred as well as the damages that resulted from each hack.[2]

The answer to my research question will be derived from managing this data in a meaningful way. These sources only provide information about the United Kingdom, so databases from other locales would also be needed to create a comprehensive large-n analysis. However, these databases alone present many of the independent variables that can be explored to explain the dependent variable presented in the research question.

Cyber Incursion Against the United Kingdom

This dataset offers one example of how a statistical analysis could be presented. While variables such as target, monetary cost incurred, and number of incursions are important, the source variable is the most important: once other targets are analyzed as well, the perpetrators and their tactics can start to be drawn out. As Ryan Maness’ codebook shows, all of these variables can be represented numerically because, at this moment, there is a finite number of incursion methods, each of which can be given a numerical value.[3] While this dataset is restricted to the United Kingdom, it can be expanded to other actors using the same methodology. However, as said previously, a weakness of these databases is the lack of proper reporting by international actors who are reluctant to show they were penetrated.
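As a rough illustration of the numerical coding described above, here is a minimal Python sketch. The method labels, the code numbers, the field names, and the sample records are all hypothetical and chosen only for demonstration; they are not taken from Maness’ codebook or from the UK datasets.

```python
from collections import Counter

# Hypothetical codebook: these method labels and codes are illustrative
# assumptions, not the categories used in Maness' codebook.
METHOD_CODES = {
    "phishing": 1,
    "malware": 2,
    "denial_of_service": 3,
    "unauthorized_access": 4,
}


def code_incidents(incidents):
    """Return copies of the incidents with a numeric method_code added."""
    coded = []
    for incident in incidents:
        code = METHOD_CODES.get(incident["method"])
        if code is None:
            # Fail loudly rather than silently dropping unknown methods.
            raise ValueError(f"Unknown method: {incident['method']!r}")
        coded.append({**incident, "method_code": code})
    return coded


def count_by_source(coded):
    """Count how often each (source, method_code) pair appears."""
    return Counter((row["source"], row["method_code"]) for row in coded)


# Invented records for illustration only; this is not real data.
sample = [
    {"source": "Actor A", "target": "UK", "method": "phishing"},
    {"source": "Actor A", "target": "UK", "method": "phishing"},
    {"source": "Actor B", "target": "UK", "method": "malware"},
]

coded_sample = code_incidents(sample)
counts = count_by_source(coded_sample)
```

Codes assigned this way are nominal labels: they make the methods countable per source, but the numbers themselves carry no order or magnitude.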

[1] “Crime Against Business,” Home Office of the United Kingdom (2017), available at: https://www.gov.uk/government/collections/crime-against-businesses

[2] “Commercial Victimisation Survey,” IPSOS Mori Social Research Institute (2017), pp. 229-237, available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/704095/commercial-victimisation-survey-technical-report-2017.pdf

[3] Ryan C. Maness, Brandon Valeriano, and Benjamin Jensen, “Codebook for the Dyadic Cyber Incident and Dispute Dataset Version 1.1,” available at: https://drryanmaness.wixsite.com/cyberconflcit/cyber-conflict-dataset

2 thoughts on “Research Portfolio Post #6: Quantitative Data Sources”

  1. Tristan, it sounds like you are off to a good start in conceptualizing your project for the large-n methodology. The data sources that you discuss here are clearly relevant to your project. As in many projects/topics, we often have to compile the data for one or more variables from a variety of sources (and although you don’t need a complete dataset for the first research design, knowing that this might well be your process, as well as knowing about the availability of data sources, is an important step at this stage). As you continue your work it would be good to refine your operationalization of your DV some more. You know that your question is “What explains the variances in cyberwarfare tactics utilized by international actors?” In your post you note various types of information that are available in the sources you have examined, but what is the precise operationalization of your DV? How will you capture the concept of “tactics utilized” as an interval/ratio numerical indicator?

  2. Tristen says:

    Fellow Tristan –

    I am thoroughly impressed by what you have found considering what you have stated about empirical data regarding cybersecurity breaches, which is quite fascinating in and of itself. I also like your use of Maness’ codebook and how you intend to apply it cross-nationally to nations other than the United Kingdom, which I believe will be essential to your research project.

    However, I, like Professor Boesenecker, question how this helps you operationalize your dependent variable of “cyber warfare tactics” if what you are examining here is simply breaches in a respective defense system. Could you perhaps change this DV to something similar if certain tactics are too difficult to find? It seems like the data and sources that you have lean much more towards breaches, and maybe that is what you could explain the variance in if the data related to tactics is too difficult to find/measure. I feel like that would not only be more helpful given the data you collected but also may provide a puzzle related to the breaches themselves, such as what factors explain variance in the outcome of a breach? That itself is a puzzle and would allow you to look small and large as well as use a dependent variable that may be easier to measure than what you have previously stated.
