Holzmann, Helge

Long-Term Accessibility of Software through Web Archives

Software is widely used and referenced in research and scientific publications. Hence, just like the results published in an article, the associated software should be preserved and made accessible in the long term. Due to the nature of software and its dynamic aspects, however, this is rather challenging. Very commonly, only related materials, such as a software's documentation, parts of its source code, or change logs, are freely available. Nevertheless, these can be very valuable for comprehending or reproducing experiments described in the literature. We found that a large portion of this data is provided on the Web. Around 60% of the software webpages we analyzed link to documentation, while about 50% even contain some artifacts of the actual software [1, 2]. Web archives are a way to preserve this information and allow for long-term accessibility, even if the software and the corresponding information change. Although many of those websites are already archived, our study shows that the evolution of a software is not always well captured. Therefore, we are working towards a proactive approach to archiving software on the Web as part of the scientific process.

[1] Holzmann, H., Runnwerth, M., Sperber, W.: Linking Mathematical Software in Web Archives. 5th International Congress on Mathematical Software, ICMS 2016, Berlin, Germany (2016).
[2] Holzmann, H., Sperber, W., Runnwerth, M.: Archiving Software Surrogates on the Web for Future Reference. 20th International Conference on Theory and Practice of Digital Libraries, TPDL 2016, Hannover, Germany (2016).