As this is your baby, the final decision about where to publish should absolutely be yours but, speaking as an academic, publishing in conference proceedings (which is how I have understood the ICQE22 output to be) is of no value. If I were you, I would write to the editor of ANS and ask whether the proposed article submission would be considered sufficiently different from the potential ICQE22 publication (as per your table above) to constitute an original article. If not, I would prioritise ANS. But, of course, the decision is yours.
In my community, it is common practice to publish first at a conference (with proceedings not already in a journal), then an extended version in a journal, provided the journal paper contains at least 30% new material. It is also important to state clearly in the journal paper that it is an extension of a previous conference paper, with the new material listed, as we did here: https://doi.org/10.1016/j.visinf.2020.09.005 (look for the paragraph in the introduction starting with: This paper builds on and extends a previous work…)
I wrote to Hocine Cherifi on August 19th, but he never got back to me. Do you (and @melancon) think it is OK to submit, adding language equivalent to yours below? Or should I ask you guys to reach out to Cherifi?
This paper builds on and extends previous work focused on presenting the algorithm behind Jasper and its time and space complexity (Vallet et al., 2016). The contributions of the current paper are instead: (1) guidelines for designing community-oriented visualizations of social networks; (2) a revised presentation of Jasper, which is designed to quickly produce an overview of a social network emphasizing communities and their interconnections.
A controlled user evaluation has been added to assess the performance of Jasper for analyzing the community structure of social networks, and to compare it to a matrix visualization and two variations of node-link diagrams. To ease future work comparing against Jasper, we deliberately used general tasks (not specific to any kind of data or domain) for community visual analytics, varied and freely available datasets (for reproducibility), widely known and used visualizations allowing indirect comparison with Jasper, and basic interactions (changing the color) which should be available in all visualization platforms.
Like all academics in France, I am pretty sure Hocine was away on vacation. He should be back soon, as will everybody here.
I think it is OK and safe to submit after adding a few lines to clearly explain the difference between the two submissions. To be honest, it is the best thing to do. As a reviewer, I hate when authors try to hide a previous paper (immediate rejection in that case).
My experience publishing in viz journals coincides with Bruno's (is that a surprise?), and I also agree with his suggestion to clearly explain in what way(s) the submitted article differs from or extends the conference paper.
Hello @icqe22_authors . I have followed the advice of Bruno and Guy, and put together a version of our paper for Applied Network Science. In the introduction, I explain (clearly, I hope) what, in this paper, is new and what is included for legibility.
I need a second pair of eyes, though: I have pushed around so much text that my brain is fried. @bpinaud, since we are submitting to a netsci journal, could you re-read with fresh eyes? Link (with edit privileges):
Main problem I have here is with releasing the dataset for that paper.
I have a script that exports data as four CSV tables per dataset, but since we wrote the ICQE22 paper (and the first version of the ANS paper, of course) the POPREBEL corpus has shifted and grown. In particular, the corpus associated with the #ethno-rebelpop-polska-interviews tag is much larger than before (58 interviews instead of 19). However, I am reluctant to redo the analysis: the project is ending, and we do not have much time.
Possibilities are:
- release the data in the form of the Tulip file I used to create the visualizations in the paper;
- redo the analysis, then release the data in CSV form (cleaner, but longer).
I am not sure I fully understand the issue (why and where do we have to release the data?). I suppose this is about a VERSION of the data that was used for the first version of the paper. If so, I second Nica’s idea. After all, this is not our presentation of results but rather our technique.
Most journals nowadays do demand you publish your data. In the case of Applied Network Science:
All manuscripts must include an ‘Availability of data and materials’ statement. Data availability statements should include information on where data supporting the results reported in the article can be found including, where applicable, hyperlinks to publicly archived datasets analysed or generated during the study.
(source)
If our argument holds on the basis of the analysis of the smaller dataset, I see no reason to redo the analysis with the larger one. When uploading the dataset to an online appendix or repository, we could say that this is the data on which our analysis is based and that we have subsequently expanded the dataset with additional ethnographic material, but I'm not sure we even need to do that.
I think the easiest thing to do is to release the dataset used for the paper. Without this dataset, it will not be possible to redo the computation. Tulip is free software, so releasing the tlpb / tlp file (not the tlpx) is enough.
The updated dataset could also be published later (once the paper is published) as a new, enriched version.
CSVs are tables, not networks. I can easily generate all the data needed to build the network in CSV format, but then you have to write code to go from the CSVs to the network.
The format in which we publish our data is this: https://doi.org/10.5281/zenodo.5575836. These are POPREBEL data, but not up to date, so do not use them for analysis; use them only to familiarize yourself with the structure. The structure stays the same.
In particular, I need data on the code networks for the Polish interviews, showing, to be precise, the relations between codes: basically, what we can see through RyderEx, but in a format I can read with other applications. How would I get it?
Everything is in the CSV files. But you need to write code that will look, for each code, at the posts it is used to annotate, and then, for each of those posts, at which other codes are also used to annotate it. This will give you the network of code co-occurrences. All entities have unique IDs, which you will find in the CSVs. The meaning of the fields in the CSV files is explained in the metadata, contained in the datapackage.json file.
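To make this concrete, here is a minimal sketch of that co-occurrence computation in Python. It assumes a single annotations table with `post_id` and `code_id` columns; the file name and field names below are placeholders, since the real ones are documented in datapackage.json.

```python
import csv
from collections import defaultdict
from itertools import combinations

def cooccurrence_edges(annotations_csv):
    """Build a weighted code co-occurrence network from an annotations CSV.

    Assumed (placeholder) schema: one row per annotation, with columns
    "post_id" and "code_id". Check datapackage.json for the real field names.
    """
    # Map each post to the set of codes used to annotate it.
    codes_by_post = defaultdict(set)
    with open(annotations_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            codes_by_post[row["post_id"]].add(row["code_id"])

    # Two codes co-occur when they annotate the same post; the edge weight
    # is the number of posts they share.
    weights = defaultdict(int)
    for codes in codes_by_post.values():
        for a, b in combinations(sorted(codes), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

The resulting dictionary (code pair → shared-post count) is an edge list that can then be loaded into any network tool, Tulip included.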