Review Comment:
To reference particular parts of the text precisely, I use page:line (for instance, 1:46).
# Originality
Core to the originality of this work is its practical focus. The authors bring together, in a novel way, music ontologies to accurately (but flexibly) record melodic patterns, a data transformation pipeline that creates a knowledge graph of melodic patterns across pieces, a SPARQL query endpoint for more technical users or those happy to adapt the template queries provided, and a user interface for less technically inclined users. I congratulate the authors on working so adeptly across disciplinary and technological boundaries. The result will be of interest both to musicologists interested in melodic similarity and to more technical experts working with ontologies and knowledge graphs more generally.
The authors do an excellent job of reviewing existing work in the music ontology and analysis space, and identifying where their own work on formalising melodic patterns fits in. They build up this context in a logical fashion from first principles (music analysis -> musical patterns -> patterns in folk songs -> ontologies). There are some minor caveats and suggested improvements in the section on “Quality of Writing” below.
The authors correctly note that much of the ontological groundwork in this space focusses on music cataloguing and classification, more than structured musical annotations (3:29). The authors’ decision to build on the Music Annotation Pattern ODP seems sound, and will hopefully enhance its future interoperability with other kinds of pattern analysis.
The focus of this taxonomy on “n-grams of scale degree” (2:29) is sensible, both because it clearly suits the Irish folk repertory the authors have worked with, and because it can intuitively and broadly be applied to other repertories that are either fully monophonic or have easily separable contrapuntal melodic lines.
While using n-grams as the basis for music analysis isn’t new, the authors distinguish their approach in the specific choices made: 1) a focus on n-grams with a specified range of length, 2) using accented notes to define these pitches, potentially better highlighting “structural” pitches, though of course this will vary by repertory, and 3) the use of diatonic notes in defining patterns: points 2 and 3 have the effect of simplifying the resultant patterns and potentially making them easier to identify despite note-level variations caused by “modal ambiguity” (3:6) or other fine-grained compositional choices like embellishment.
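To make the three choices above concrete, here is a minimal sketch of n-gram extraction over accented, diatonic scale degrees. It is my own illustration, not the authors' implementation: the function name `accent_ngrams`, the `(scale_degree, is_accented)` input format, and the default length range are all assumptions made for the example.

```python
from collections import Counter


def accent_ngrams(notes, n_min=4, n_max=6):
    """Count scale-degree n-grams over accented notes only.

    `notes` is a list of (scale_degree, is_accented) pairs, where
    scale_degree is a diatonic degree 1-7. Both the input format and
    the default length range are hypothetical.
    """
    # Choice 2: keep only the accented notes.
    degrees = [degree for degree, accented in notes if accented]
    counts = Counter()
    # Choice 1: restrict n-grams to a specified range of lengths.
    for n in range(n_min, n_max + 1):
        for start in range(len(degrees) - n + 1):
            counts[tuple(degrees[start:start + n])] += 1
    return counts


# Toy melody: unaccented passing notes are dropped before counting.
toy_tune = [
    (1, True), (2, False), (7, True), (1, True), (2, False),
    (3, True), (1, True), (7, True), (1, True), (3, True),
]
toy_counts = accent_ngrams(toy_tune, n_min=4, n_max=4)
```

In this toy case the unaccented degree 2 never enters a pattern, which is the simplifying effect described above: note-level embellishment does not change the resulting n-grams.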
A core contribution of this work is the development of a software pipeline to create a knowledge graph from patterns, giving a structured way of exploring not just what patterns occur in a repertory, but how they are connected. The choice to build on Smashub rather than start from scratch is sound and well-reasoned.
# Significance of Results
Overall, this is an impressive set of work that brings together not just ontological foundations, but data transformation pipelines, a SPARQL query endpoint, and a graphical user interface. The authors are to be commended on their contribution to the study of patterns in Irish folk music and to computational musicology and Music Information Retrieval more broadly.
Section 1.2 does a good job of explaining the potential for this work to support the identification of “tune families” and other forms of melodic similarity and transformation. While it is mentioned briefly that this kind of n-gram analysis can be adapted to other repertories, it would be worth mentioning one or two concrete examples to help readers envisage how they might translate this work to other corpora.
Section 4 does a thorough job of stepping through how the eXtreme Design methodology was applied and how the Competency Questions were addressed concretely through carefully-crafted SPARQL queries. The inclusion of a collection of sample SPARQL queries to demonstrate the flexibility of the knowledge graph in addressing different types of questions is significant.
Consultation with musicologists through various co-design processes is a strength of the work, which extends to the GUI. Since musicologists would need the help of more technical specialists in crafting SPARQL queries, the development of a simple user-interface should enhance the overall accessibility of the work. I particularly like the network view as a way to simply and intuitively show where musical works share melodic patterns.
An implicit theme across all this work is democratising access to corpora of linked musical patterns, and music analysis more broadly. This theme is hinted at, but it is worth addressing more explicitly in the latter part of the article where the narrative shifts from constructing the knowledge graph to querying it to answer research questions. For instance, the opening of Section 5 could be much stronger in articulating why and how a Graphical User Interface will broaden the impact of this research. SPARQL endpoints are often under-utilised by non-technical domain experts, so both the GUI and the set of example SPARQL queries have an important role to play that should be highlighted.
Similarly, given the shifts in the technology landscape since this work was undertaken, it might be helpful to provide some preliminary comment—whether positive or negative—on where advances in LLMs and associated technologies could further democratise access to pattern analysis in musical corpora in the future, be this in identifying these patterns or subsequently querying them. This is particularly salient given the objective of the authors to support “open-ended integration with future work” (2:9). There are a number of projects that leverage LLMs to construct SPARQL queries (https://data.carnegiehall.org/datalab/voicebox/), and many of the Python tools developed in this project could be adapted to support chat-based interactions through the Model Context Protocol (https://modelcontextprotocol.io/docs/getting-started/intro).
# Quality of Writing
The article is well-written and organised overall, but anything that can be done to enhance clarity both at the micro and macro levels would be worthwhile. The comments below are suggestions as to how this might be best accomplished, but I leave it to the editors to make a final determination so long as the manuscript is tweaked with clarity and readability front and centre. This is particularly important when readers from diverse backgrounds will be interested in what has been produced here.
The article can be difficult to read in places due to the density of acronyms of different technologies and ontologies. Personally I’d lean toward a little more verbosity to aid clarity for readers, but this is largely an editorial decision. For instance, the repeated use of KG instead of knowledge graph is particularly jarring, especially when it occurs in headings (like section 1.3) or the abstract.
The capacity of the authors to build effectively on their own past work, as well as ontological and software foundations developed by others, should be applauded. At times, however, it is a little difficult to unpick 1) what constitutes background work, 2) what constitutes new work presented for the first time here, and 3) how everything slots together. I think this could be readily addressed with some modest changes to section 1.4 (4:28), in the lead up to section 2, either in the text or even with a minimal timeline that spells out the sequence of work.
In the same vein, the Introduction would benefit from a high-level summary (or diagram) of how project components fit together, along the lines that we have at the very end in the Conclusion (section 6).
Section 1.4 should also make some mention of Section 4 (Evaluation).
A few other minor details I noticed:
(1:46) “recognizing genres” is perhaps redundant, since genre is mentioned twice earlier in this sentence. This whole sentence could perhaps be simplified slightly.
(2:2) I don’t think “most-common” should be hyphenated here unless it is being used as a special term, in which case it should be elaborated a bit more fully.
(2:22) Capitalise “music information retrieval”?
(2:42) Hyphenate “19th century”?
(5:6) “common format known as JAMS to represent these JAMS annotations”. This phrasing is a little confusing.
(5:42) “A pattern has contents”. Perhaps describe pattern_content more fully either here or in the paragraph above.
(9:1) It is a little confusing for readers that Table 2, which lists out the competency questions, appears well above the Evaluation section. Similarly for Listing 1. But presumably this was done to optimise the use of space.
(10:6) “To save space, we present just four CQs, listed in 2”. It seems like more than four competency questions are discussed, and I’m not sure what “listed in 2” describes.
(12:7) Stray dash (-) near start of line
(16:13) “useres” -> “users”
(17:36) The pattern ‘1, 3, 1, 7’ in the caption of Figure 4 doesn’t seem to match the screenshot (which instead has ‘1, 7, 1, 3’). There are one or two other subtle mismatches between the figure and the caption so it is worth tweaking this to ensure it matches exactly or simply summarises the figure.
(19:1) It is somewhat jarring to have a Figure placed after the conclusion of the text. It would be good to move this further up if at all possible.
(19:28) References [5] and [6] appear to be duplicates, and it appears there may be other duplicates ([15]/[16]), so it is worth reviewing these carefully. There also seem to be some inconsistencies in names that could be tidied up (see the author Penuela across [12] to [16]).
# Data and resources
The associated GitHub repositories are neatly organised, fairly well documented, and pass basic testing. The tabular metadata included in each README file is a helpful addition. The code itself is nicely structured with clear and intuitive class names, docstrings, and methods.
A few points that could be addressed with small tweaks to the repositories:
1) There is some inconsistency across repositories as to whether dependency versions are pinned with specific versions of packages. (https://github.com/polifonia-project/patterns-knowledge-graph/tree/main/... does not pin them, but this one does https://github.com/polifonia-project/folk_ngram_analysis). Perhaps in a research context like this, pinning the dependencies makes most sense, but if you go this route, dependencies should likely be updated to the extent possible without breaking functionality.
2) Again along the lines of reproducibility, the Patterns Knowledge Graph repository could benefit from the inclusion of a Dockerfile, since it brings together both Java and Python dependencies, and a container could better encapsulate these in a reproducible way.
3) For the FoNN tools specifically, pickle likely isn’t the best format for a “corpus-level Pandas dataframe” (8:5) if there is an expectation that these tables might be a useful output in their own right. There are potential security considerations around pickle files and other formats would be more broadly interoperable.
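To illustrate the concern in point 3, the sketch below uses only the Python standard library. It shows that loading a pickle can invoke a callable named inside the file, whereas a plain-text tabular format such as CSV is parsed as data only. The class `Payload` and the column names `tune_id`, `pattern`, and `count` are hypothetical and not taken from the FoNN code.

```python
import csv
import io
import pickle


class Payload:
    """Toy object whose pickle stream triggers a function call on load."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) on load.
        # `sorted` is harmless; a hostile file could name any importable callable.
        return (sorted, ([3, 1, 2],))


# Loading returns whatever the embedded call produced, not a Payload.
loaded = pickle.loads(pickle.dumps(Payload()))

# Hypothetical rows standing in for a corpus-level pattern table.
rows = [
    {"tune_id": "tune_a", "pattern": "1 7 1 3", "count": 2},
    {"tune_id": "tune_b", "pattern": "1 3 1 7", "count": 1},
]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["tune_id", "pattern", "count"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Reading CSV back yields plain strings only; no code runs during parsing.
restored = list(csv.DictReader(io.StringIO(csv_text, newline="")))
```

With Pandas itself, `DataFrame.to_csv` or `DataFrame.to_parquet` (the latter needs an optional engine such as pyarrow) would be the analogous calls. Which format best suits these tables is for the authors to judge.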
These points are perhaps minor in isolation so I leave any adjustments to the discretion of the authors and editors, but collectively they highlight the need to explicitly address the reusability and sustainability of the code artefacts. I suggest adding some text to the article about how these GitHub repositories and any other resources are structured to support future use by researchers and practitioners, and how the software and data are organised to align with FAIR principles. This may involve some modest tweaks to the repositories, which don’t seem to have seen new commits in the last couple of years.