
Thesis by publication (cumulative dissertation)

Copyright notices prior to publication.

If your dissertation contains complete published or submitted papers, there are several copyright issues to consider.

1. Checking the legal conditions

Please check the publisher's conditions regarding the reuse of articles in your dissertation. The provisions of your publishing contract are authoritative. If no specific provisions are expressed in the contract, the following policies apply: Publishing Policies Dissertations. Is your publisher not listed? Please ask the editor/publisher.

If you have received no answer or are unsure about the publisher's specifications, please contact the Open Access Team.

2. Embedding required phrases

When reusing a publication, publishers commonly require that a fixed phrase be included. Please always place it at the beginning of the corresponding chapter.

3. Agreement of the co-authors

Please ask any co-authors for their consent to self-archiving (e.g., in writing by email). The dissertation office does not require proof.

4. Overview of included publications

Please attach a summary overview to your dissertation, listing all publications that have been fully included in your work. The following information should be included:

  • Complete bibliographic data
  • Version information (preprint, accepted manuscript, publisher's version) (see: glossary of publishing policies dissertations)
  • DOI (if available) as an active link, starting with https://doi.org/...
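If you prepare the summary overview in LaTeX (an assumption on my part; any tool that supports hyperlinks works equally well), an actively linked DOI can be produced with the hyperref package. The citation and DOI below are hypothetical placeholders:

```latex
\documentclass{article}
\usepackage{hyperref} % provides \href for clickable links in the PDF

\begin{document}
% One entry of the summary overview: bibliographic data, version, linked DOI
Doe, J. (2023). Example article title. \emph{Journal of Examples}, 12(3), 45--67.\\
Version: publisher's version.\\
DOI: \href{https://doi.org/10.1000/xyz}{https://doi.org/10.1000/xyz}
\end{document}
```

Linking the full `https://doi.org/...` form (rather than a bare `doi:` string) ensures the link resolves directly from the PDF.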

Ways to publish

For cumulative dissertations, you can choose one of the following ways of publication:

  • Online publishing of cumulative dissertations on DepositOnce
  • Submission of 15 print copies (dissertation print)

Implementation rules

Prior to the scientific discussion (defense), the implementation rules of the individual faculties are binding.

Implementation rules by the faculties

  • Faculty I - Humanities and Educational Sciences
  • Institute of Chemistry (pdf, 62 kB)
  • Institute of Physics (pdf, 64 kB)
  • Institute of Mathematics
  • Faculty III – Process Sciences
  • Faculty IV – Electrical Engineering and Computer Science  (pdf, 108 kB)
  • Faculty V – Mechanical Engineering and Transport Systems
  • Faculty VI – Planning Building Environment
  • Implementation regulations (2014)  (pdf, 12.62 kB)
  • Implementation regulations (2021)  (pdf, 138 kB)
  • Creative Commons licenses
  • legal information on publishing
  • Publishing Policies Dissertations

Dissertation Service of the University Library

[email protected]


Thesis and Dissertation Guide

IV. Copyrighting

A copyright is an intangible right granted to the author or originator of certain literary or artistic productions, under which they are invested for a limited period with the sole, exclusive privilege of making copies and publishing and selling them.

Copyright protection automatically exists from the time the work is created in fixed form. There is no requirement that the work be published or registered to obtain protection under copyright law. The copyright of any work immediately becomes the property of the author who created the work, unless it is a work-for-hire, or unless ownership has been assigned by written agreement.

Receipt of a submitted and approved thesis or dissertation in The Graduate School results in the publication of the document by the University Library at UNC-Chapel Hill. As such, each student grants the University a limited, non-exclusive, royalty-free license to reproduce the student's work, in whole or in part, in electronic form to be posted in the University Library database and made available to the general public at no charge. This does not mean that UNC-Chapel Hill owns the copyright to your work (you do), but the University has the right to reproduce and distribute your work. Public universities often require students to allow reproduction and distribution of academic work to support the dissemination of intellectual thought and discovery. Please review the Copyright Policy of the University of North Carolina at Chapel Hill for additional information.

Regardless of whether or not you register copyright for your thesis or dissertation, UNC-Chapel Hill requires that you include a copyright notice following the title page. See Section I of this Guide and the sample copyright page for the format of this notice. Including this page helps to establish that you are the owner of the work. It also protects you, as the copyright holder, from anyone claiming innocent infringement or unintentional violation of copyright.

You may wish to register your copyright with the U.S. Copyright Office at the Library of Congress. As mentioned above, copyright registration is not a condition to copyright protection. There are, however, advantages to registration, especially if you have a claim of infringement of your copyright. Registration may be made at any time within the life of the copyright, but there are advantages to filing for registration within three months of publication. For more information on registration, consult the website of the U.S. Copyright Office .

There are two main ways for you to file for copyright of your thesis or dissertation:

  • You may empower ProQuest to file the application on your behalf. When you submit your thesis or dissertation, ProQuest charges a fee for this service ($55, subject to change). The service includes preparing an application in your name, submitting your application fee, depositing the required copy or copies of the manuscript, and mailing you the completed certificate of registration from the Library of Congress.
  • Alternately, you may file for copyright directly. Visit the following U.S. Copyright website for more information about registering your work . There is a copyright fee for filing copyright directly with the U.S. Copyright Office ($35, subject to change).

Any copyrighted materials used in your work, beyond brief excerpts, may be used only with the written permission of the copyright owner. Book and journal publishers normally hold the copyright for all materials they publish. Therefore, even if you are the sole or one of several authors of material in a published book or journal, you must obtain written permission from the copyright holder if you are including this material in your document. Remember that use of reproductions or excerpts of other media, such as music, graphic images, or computer software may also require permissions.

Your letter to the copyright holder needs to make clear that you seek written permission to preserve (on microfilm and digitally) and publish (in print and digital form) your thesis or dissertation through ProQuest and that ProQuest may sell, on demand, for scholarly purposes, single copies of your work, which includes the copyright holder's material. Your letter must also seek written permission for the document to be submitted in electronic format to UNC-Chapel Hill where it will be placed in a database and made available through the University Library to the general public at no charge via the Internet.

You are responsible for securing all necessary permissions and paying any permission fees in advance of using copyrighted materials in your work.

Use of Your Own Previously Published Material

Some academic programs permit you to include articles or other materials that you have previously published, that have been accepted (or submitted, in press, or under review) for publication, or that have been otherwise presented to the public within the body of your thesis or dissertation. In all such instances the following guidelines apply:

  • If the material is co-authored, your academic program must approve its inclusion in your thesis or dissertation.
  • If the material is copyrighted (if you are the sole author but the copyright is held by the publisher), you must fulfill the conditions specified in the section above on using copyrighted materials .
  • The material, if included in the body of your text, must conform to all formatting guidelines outlined in this Guide. See the Formatting Previously Published Work section for details.


University Library, University of Illinois at Urbana-Champaign

Copyright for Graduate Students: Theses and Dissertations


Copyright Questions?

Copyright law can be difficult and confusing. This webpage is meant to provide you with guidance, but not legal advice.

Should you have further questions, please do not hesitate to ask Sara Benson, the Copyright Librarian, for assistance. Sara can be reached at 217-333-4200 or [email protected]

Scholarly Communication and Publishing

Some Copyright Ground Rules

  • General Concepts
  • What Does Copyright Protect?
  • A work created today (or, more specifically, after 1989) is protected under copyright  as soon as it’s created  and is (generally) protected for the lifetime of the creator, plus 70 years (could be even longer for some works).
  • There is no special symbol  (such as the copyright symbol) necessary on the protected work since 1989--it is protected simply because someone created it and wrote it down or recorded it.
  • If more than one person created a work, they might be  joint owners of a work  (see "Copyright Ownership" below).
  • When copyright expires, the work becomes  public domain .
  • Ideas can’t be copyrighted, only the tangible expression in a fixed medium of the idea can.  
  • Facts can't be copyrighted, either.
  • Limited use of copyrighted material without permission is possible under the  “ fair use ”  doctrine, within fair use guidelines.
  • If something  looks copyrighted, assume it is.
  • Copyright protects an author's right to reproduce (copy), distribute (license), make derivatives of the work, publicly display and perform the work
  • This means that if you wish to make a copy of a copyrighted work (unless it is considered a  " fair use ")  you must  get permission  from the owner of the work
  • You also generally cannot publicly display a copyrighted work (say a movie or work of art) unless you have permission to do so or a recognized  "copyright exception"  exists

Copyright Ownership

  • The Author Is The Initial Owner
  • Ownership Can Be Assigned or Transferred
  • Works Can Be Made Available Under Terms More Favorable Than Copyright Allows
  • Joint Ownership of A Copyrighted Work

If you wrote an essay or article, you are the owner of that article unless and until you contract away your rights (such as in a publishing agreement).

Giving away the bundle of rights that constitute copyright is often called a grant. If the transfer is exclusive it has to be in writing. In books/articles, this usually occurs in a publishing agreement.

The Creative Commons has developed a series of licenses that allows copyright holders to retain control over their works, but still make them available under terms more favorable than copyright allows.  Essentially, under the creative commons licenses, owners of copyright have allowed others to use their work with certain limitations specified in the creative commons license.

More information about the  creative commons  license is available on their website at  www.creativecommons.org .

A work is considered joint if it meets these conditions:

  • both or all of the authors intend that their contributions be merged into a single work; and
  • this intention exists at the time of creation of the work.

No written contract is necessary to create a joint work. Each author owns an undivided portion of the entire work. So, one author can grant another person permission to use the work without the agreement of the other author. The only obligation is to share in any profits received.

For some additional information about copyright in the music industry, please see the LibGuide on  Copyright Resources for Music .

The Basics of Copyright

Common Questions & Answers

Q: Should I put some sort of copyright notice on my work?

A: It is wise to do so because even though it is not required, many people misunderstand basic copyright law rules. Putting a notice on your work will remind others not to use it unless an exception to general copyright rules applies or they have obtained your permission first.

Q:  As long as something is for educational use, I'm not violating copyright laws, right?

A: Unfortunately, no. Although there is a limited exception for face-to-face teaching, not all educational uses of copyrighted works will fall under that exception and fair use is decided on a case-by-case (not a blanket exception) basis.

Q:  How do I know FOR SURE that something is a fair use?

A:  That's a tough one. Unfortunately, it is hard to know when something is a fair use for sure because, ultimately, the court decides fair use cases on a case-by-case basis. Generally, we should exercise our good faith judgment and consider risk assessment when making fair use determinations.  But, this does not mean that we shouldn't exercise our fair use rights.  We should do so in a considered way.

Q: What role does licensing play in specific copyright questions?

A: A very large role. Essentially, you can contract away (through licensing) any of your copyright rights. So, for instance, if I write a journal article but assign my copyright entirely to the journal publisher, then I no longer have any right to share my article either publicly or privately without the permission of the journal.

The content for this page originated with the School of Music's Copyright LibGuide .  

Except where otherwise indicated, original content in this guide is licensed under a   Creative Commons Attribution (CC BY) 4.0 license . You are free to share, adopt, or adapt the materials. We encourage broad adoption of these materials for teaching and other professional development purposes, and invite you to customize them for your own needs.

  • Last Updated: Mar 21, 2023 12:58 PM
  • URL: https://guides.library.illinois.edu/copyrightforgradstudents

Copyrighting your Dissertation

In the United States, you automatically own the copyright in your original creative authorship, such as your dissertation, once it is fixed in a tangible form ( i.e. , written down or recorded). United States law does not require you to include a copyright notice on your dissertation or to  formally register  with the U.S. Copyright Office in order to secure copyright protection over your work. However, there are some benefits to including a copyright notice and registering your work. See the  Copyright Guide  for more information or to schedule a consultation.

Including a Copyright Page in your Dissertation

Including a copyright page in your dissertation is optional but recommended. For details on how to format the copyright page, consult the  PhD Dissertation Formatting Guide  and the  PhD Dissertation Formatting Checklist .

Using Your Own Previously Published Material in Your Dissertation

University of Pennsylvania  policy  allows you to include your own previously published work or articles submitted for publication as part of the dissertation with the following conditions:

  • You must obtain approval of the dissertation committee and Graduate Group Chairperson.
  • You must obtain written permission from the copyright owner, which may be the journal, publisher, and/or any co-authors, unless you are the sole copyright holder (depends on your publishing agreement).
  • You must upload any permission letters in ETD Administrator as an  Administrative Document  titled “Permission Letter – Do Not Publish.”
  • For dissertations based on joint work with other researchers, a unique and separate dissertation must be presented by each degree candidate. You must include a concise account of your unique contribution to the joint work, and the remainder of the dissertation must be authored solely by you. Authorship of an entire dissertation by more than one degree candidate is not allowed.
  • Your dissertation must be formatted as a single document with consistent formatting and styles throughout. If you are using multiple previously published articles, make sure to make the formatting consistent with the rest of the document.

When using previously published or in-press work, you must disclose this information in your dissertation in the following format:

  • Under the Chapter title, list the full citation for the previously published/in-press article in the citation style used in your Bibliography.
  • If it is a jointly authored article, describe your contribution to the work in a separate sentence.
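As a rough LaTeX sketch of this disclosure (the chapter title, citation, and contribution statement are all hypothetical; your Graduate Group may prescribe a different layout):

```latex
\documentclass{report}
\begin{document}

\chapter{Example Chapter Title}

% Full citation of the previously published article, in the
% citation style used in the Bibliography (hypothetical citation):
\noindent Doe, J., and Roe, R. (2022). Example article title.
\emph{Journal of Examples}, 1(1), 1--10.

% For a jointly authored article, describe your own contribution
% in a separate sentence:
\noindent I designed the study and wrote the manuscript; R.~Roe
performed the data analysis.

\end{document}
```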

Example of Dissertation Formatting

Using Other Copyrighted Material in Your Dissertation

If you use third party copyrighted material (images, quotations, datasets, figures), you are responsible for re-use of that material (see the  Policy on Unauthorized Copying of Copyrighted Media ). In many cases, you may be able to use copyrighted material under the “ fair use ” provision of U.S. copyright law. Consult the  PhD Dissertation Formatting Guide  and the  PhD Dissertation Formatting Checklist  for information on how to submit written permission from a copyright holder. Typically, you will need to request a permission letter and upload the letter as an  Administrative Document  in  ETD Administrator .

If you still have questions regarding copyright and “fair use,” refer to the  Penn Libraries Copyright Guide  or email  [email protected]  for further support.

Patent and Intellectual Property

Any inventions that you make as part of your research for your degree and disclosed as part of your dissertation, and any patent or other intellectual property rights arising therefrom, are governed by the policies of the University of Pennsylvania, including the  Patent and Tangible Research Property Policies and Procedures  and  Policy Relating to Copyrights and Commitment of Effort for Faculty.  For more information, please contact the  Penn Center for Innovation .

There are strict deadlines under U.S. and international law regarding the timing for filing patent applications and the public availability of your dissertation. Contact the  Penn Center for Innovation  to discuss whether there might be a patentable invention disclosed in your dissertation prior to deposit of your dissertation.

Frequently Asked Questions

Do I have copyright over my dissertation?

Yes. According to US Copyright law, you have copyright immediately and automatically over any of your new, original works in a “fixed, tangible form” ( i.e. , written down, recorded, etc.). You do not need to register or to include a copyright symbol © or any other formal marks to secure your copyright, though there are some benefits to doing so. See the  Copyright Guide  for more information or email  [email protected]  for further support.

Should I register the copyright in my dissertation with the U.S. Copyright Office? 

It depends on what you want to do with your dissertation. There are  some benefits to registering the copyright  in your dissertation depending on your future goals. However, keep in mind that you automatically have copyright over your dissertation without formally registering. To learn more about formally registering the copyright in your dissertation, see the  Copyright Guide  or schedule a consultation.  

Should I pay ProQuest to register my copyright?

Note that you already have copyright over your dissertation, but if you would like to  formally register your copyright with the U.S. Copyright Office , you can pay ProQuest to do it for you (you will have the option in ETD Administrator). For less cost, you can register it yourself on the  copyright.gov  web page. Information on registering your copyright is available in the  Copyright Guide . Please keep in mind that if portions of your dissertation are comprised of previously published co-authored material,  you cannot  register your copyright through ProQuest. 

What is a Creative Commons license?

A copyright license grants permission for someone else to use your copyrighted work.  A  Creative Commons  license is one type of copyright license. It works hand in hand with your copyright. It is not an independent type of copyright. By using a Creative Commons license you are telling the world under what circumstances they are able to use your work without asking your permission each and every time.  You can only add a Creative Commons license to your work if you are the copyright holder, and have not transferred your rights to someone else (like a publisher).

You may choose to apply a Creative Commons license to your dissertation by adding it to the copyright notice page; see the  PhD Dissertation Formatting Guide  for an example. Visit the  Creative Commons website  to review all the licenses in full detail and select one that fits your needs.

Refer to the  Services for Authors Guide  or  schedule a consultation  to learn more about using a Creative Commons license on your dissertation.

I want to use copyrighted materials in my dissertation. Is that okay?

It depends. If the materials you wish to incorporate into your dissertation are copyrighted, you will need to do a  fair use analysis  for each item you use to determine if you can proceed without getting permission. If you do not feel that you can make a good “fair use” case, you will need to  request permission  from the copyright holder and provide all permission letters as  Administrative Documents  in ETD Administrator. Just because you are using the work for educational purposes does not automatically mean that your work is “fair use” or that you have permission to use the work.  Request a consultation  to learn more about fair use and other copyright considerations.

I want to use my own previously published materials in my dissertation. Is that okay?

It depends. If the materials you may wish to incorporate into your dissertation are published in a journal or other publication, you may need to seek permission from the journal, publisher, or any co-authors. These permission letters must be uploaded as supplementary material in ETD Administrator before the deposit date. Please refer to your publication agreement for further information.

Additionally, using previously published materials as part of your dissertation requires approval of the dissertation committee and Graduate Group Chairperson.

I would like to know more about publishing, copyright, open access, and other/related issues. How can I find out more?

The Penn Libraries offers a range of workshops and presentations on these topics (and other digital skills related topics)  throughout the year . Groups can request a number of these workshops for classes or other group settings. For personal discussions about copyright, fair use, Creative Commons, scholarly publishing, and other related topics, please  contact your subject librarian  for support and further referrals. For more general information about these and related topics, review the  Penn Libraries’ guides  by keyword or subject.

Theses & Dissertations


Helpful Links

  • Copyright Issues Related to the Publication of Dissertations

Copyright is an important component of publishing your dissertation or thesis. Consider copyright as early in your work as possible, especially if you wish to reuse content from another copyright holder, such as images or figures. The following sections cover things to consider when reviewing your copyright needs and uses.

For additional information and resources on copyright, please visit the Copyright Guide . 

Determining Copyright Ownership

Under Carnegie Mellon University’s  Intellectual Property Policy , you most likely own the copyright to your dissertation. However, if the research was sponsored by the university or conducted under an agreement between an external sponsor and the university, check the agreement to see who owns the intellectual property. When in doubt, consult Carnegie Mellon’s  Center for Technology Transfer and Enterprise Creation  (CTTEC),  412-268-7393  or  [email protected] .

Neither the University Libraries nor ProQuest/UMI require copyright transfer to publish your dissertation. Both require only the non-exclusive right to reproduce and distribute your work.

Copyright Permissions

According to the  Fair Use Policy of Carnegie Mellon University , all members of the University community must comply with U.S. Copyright Law. When a proposed use of copyrighted material does not fall within the fair use doctrine and is not otherwise permitted by license or exception, written permission from the copyright owner is required to engage in the use.

To avoid publication delays, Carnegie Mellon’s Office of the General Counsel encourages graduate students to get permission from copyright holders as early in the dissertation process as possible. This includes permission to use your own previously published work if you transferred your copyright to the publisher. See  Copyright Issues Related to the Publication of Dissertations  for more information.

If you choose to publish your dissertation with ProQuest/UMI, you must sign an agreement indicating that you have the necessary copyright permissions, and provide UMI with copies of the permission letters. If you choose to publish with Carnegie Mellon University Libraries, you need not provide copies of the permission letters. The assumption is that you have complied with university policy.

Registering Your Copyright

The  Copyright Law of the United States  gives the copyright owner the exclusive right to copy and distribute the work, perform and display it publicly, and create derivative works. Copyright owners do not need to register their work with the U.S. Copyright Office to acquire these rights. However, if you own the copyright to your dissertation and you have a compelling need to acquire additional legal rights, such as the right to file a copyright infringement lawsuit, then you should register your copyright with the U.S. Copyright Office.

You can register your copyright using the U.S. Copyright Office’s  eCO Online System  for a fee of $35. Alternatively, if you choose to publish your work with ProQuest/UMI, UMI will register your copyright for you for a fee of $55. (See page 6 of the  ProQuest Publishing Agreement .)

  • Last Updated: Mar 19, 2024 11:21 AM
  • URL: https://guides.library.cmu.edu/etds

Writing a cumulative dissertation

A cumulative dissertation is a collection of articles which have been published in recognised scientific journals or accepted for publication. My PhD dissertation is a cumulative one and in this blog post I describe its structure and things to pay attention to when writing your own.

Identify articles and contributions

Identify the articles that make up your dissertation and identify the main contributions of these articles. A single article can have multiple contributions and a single contribution can be explored in multiple articles.

For example, consider the following articles with their corresponding contributions:

  • "How to better harvest bananas": contribution is a new harvesting method
  • "Discovery of a brown magic powder": contribution is cacao powder
  • "What do almonds and butter have in common?": contribution is almond butter
  • "3-Ingredient brownies": contribution is a brownie recipe using only three ingredients

The first three contributions (ingredients) are used in the fourth and final contribution (recipe).

The structure of your dissertation looks as follows (in chapters):

  • Introduction
  • Chapter about contribution 1
  • Chapter about contribution 2
  • …
  • Chapter about contribution n
  • Conclusion

Chapter 1 provides an introduction to the work that is described in chapter 2 and onwards. Chapters 2 up to the second-to-last align with the articles/contributions. The final chapter concludes the dissertation and looks beyond the work conducted during the PhD.

Considering that we have four articles/contributions in our example, its structure looks like this:

  • How to better harvest bananas
  • Discovery of a brown magic powder
  • What do almonds and butter have in common?
  • 3-Ingredient brownies

Note that the titles of the chapters do not necessarily need to be the same as the titles of the articles. They can be altered if that better fits the story.

For example:

  • Ingredient: bananas
  • Ingredient: cacao powder
  • Ingredient: almond butter
  • Recipe: 3-Ingredient brownies
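If you write the dissertation in LaTeX (an assumption on my part; the same structure applies in any authoring tool), this chapter layout can be sketched as a minimal skeleton, with hypothetical file names:

```latex
\documentclass[11pt]{book}

\begin{document}

\include{introduction}        % story of the PhD: problem space, challenges,
                              % research questions, list of publications
\include{ingredient-bananas}  % article 1: "How to better harvest bananas"
\include{ingredient-cacao}    % article 2: "Discovery of a brown magic powder"
\include{ingredient-almond}   % article 3: "What do almonds and butter have in common?"
\include{recipe-brownies}     % article 4: "3-Ingredient brownies"
\include{conclusion}          % conclusions and future work

\end{document}
```

Keeping one file per chapter makes it easy to reorder chapters later if the story changes.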

Introduction chapter

Or "the story of your PhD".

The introduction chapter is similar to an introduction section in an article. This chapter is the most important one, as it describes the problems the dissertation tackles, what the contributions are, and how the contributions are related to each other. This last point is really important, as it turns everything into a single, coherent story! Consequently, you will spend the most time on this chapter.

The chapter includes the following parts:

  • General introduction to the problem space.
  • More detailed introduction to a part of the problem space.
  • Challenges that are tackled in the dissertation.
  • Background information needed to understand the research questions and hypotheses.
  • Research questions and hypotheses based on the challenges, together with the corresponding contributions.
  • List of publications.

Try to make parts 1 to 3 as understandable as possible for people who are not part of your academic community, such as family, friends, colleagues from other departments, and so on. I highly recommend starting to use an example as early as possible and making that example related to something that everybody understands.

Parts 4 and 5 should still be as understandable as possible. Of course, here it's hard to avoid all technicalities because you have to get to the essence of your dissertation at some point. But at least try.

Some of your articles might have a research question. If that is the case: nice, you can reuse them! But check whether they need rephrasing to fit your story. If you have articles without a research question, create one that fits both the story and the work described in the article. Ideally, every research question should also have a hypothesis, which might already be in the original article or might be newly created for the dissertation. Note that not every research question needs a hypothesis. In my case I had a more exploratory article without a research question. I created a research question, but not a hypothesis, because that did not make sense. This all depends on the article itself. So check your articles and your story, and see what is possible and what is not.

Part 6 contains two lists: one with the main articles of the dissertation and one with all publications you (co-)authored during your PhD.

In general, make sure that you have a story (not just a bunch of words and paragraphs) that is clear and not too technical, but that still positions and outlines the great contributions of your dissertation.

Aligning this with our example results in:

  • General introduction to cooking and desserts.
  • More detailed introduction to creating cake-like desserts with a limited amount of ingredients.
  • Challenges:
      • Improve the efficiency of bananas in desserts,
      • Find an alternative way to flavor a cake-like dessert,
      • Find an alternative butter that works in desserts, and
      • Create a cake-like dessert that has at most four ingredients.
  • Background information about desserts, recipes, ingredients, and so on.
  • Research questions and contributions:
      • "How to improve the harvesting of bananas?" with as contribution a new method,
      • "How to flavor desserts through the use of powder?" with as contribution cacao powder,
      • "How to improve the consistency of desserts through a non-conventional ingredient?" with as contribution almond butter, and
      • "How to create a cake-like dessert using bananas, cacao powder, and almond butter?" with as contribution 3-ingredient brownies.

Contribution chapters

Ideally, every contribution is contained in a single article and you can put every article in a separate chapter. That way a single research question and hypothesis are aligned with a single chapter. If it reads better (for the story) to put multiple articles in the same chapter, then that is not a problem. Looking at our example, we could have something like

  • Ingredients: bananas, cacao powder, and almond butter

This is fine. Important to note here is that this all depends on the story you set out in the introduction chapter. So I suggest creating a decent version of that chapter first before moving on to the contribution chapters.

The structure of a contribution chapter is as follows:

  • A short introduction
  • A copy of the original article

The introduction explains how the chapter fits into the story, where in the story the reader is, and what is discussed. Note that the content of the original article can be adjusted, for example, to

  • Rephrase the research questions and hypotheses.
  • Replace words to make terminology consistent across all articles, because you might use different words for the same concept in different articles. It happens to all of us, especially when there are four years between the first and last article.

Conclusion chapter

The conclusion chapter is similar to a conclusion section in an article. It concludes your story and looks at what can be investigated in the future. More specifically, its structure is as follows:

  • Impact of contributions
  • Remaining challenges and future directions

The first part reflects on the research questions, hypotheses, and corresponding contributions, and discusses how they tackle the challenges mentioned in the introduction chapter. The second part describes remaining challenges based on the aforementioned challenges and contributions.

Remaining challenges for our example are

  • How to create similar brownies with different flavors?
  • The production of almond butter needs to be improved if it's used in more and more recipes.

Additionally, the second part also discusses future directions:

  • What we can do next to tackle these new challenges.
  • What you have already done regarding these challenges. This is not required, but do mention it if you have done something.
  • What your vision is for the future.

For our example:

  • Can we change the flavor of cacao powder to change the whole flavor of the brownies?
  • Higher quality almond butter can be produced through the use of a dedicated fridge, as described in my most recent article.
  • In the future more and more recipes will rely on almond butter, both from an economic and a flavor point of view.

These are the most important things that I have learned during the writing of my dissertation. Note that these are mere suggestions that might or might not work for your dissertation, so do not hesitate to deviate from them if you feel the need to. It is your story after all 😉

If you have any questions or remarks, don't hesitate to contact me via email or Twitter.

Elephant in the Lab

Mennatullah Hendawy

How to structure a cumulative dissertation: Five strategies

1 June 2021 | doi:10.5281/zenodo.4786446


In this article, Mennatullah Hendawy shares some insights on structuring cumulative dissertations based on her own experience

“The whole is other than the sum of its parts” (Aristotle)

In general, there are two styles of doctoral dissertations: the monograph (thesis as a book) and the cumulative thesis (thesis by publications/papers). In this article, I will share some insights regarding cumulative dissertations based on my own experience. A cumulative dissertation consists of a series of papers published or submitted for publication during the timeframe of the doctoral study. In addition to the papers, the PhD student is required to create an overarching argument that is presented in the thesis's introductory and concluding chapters. The number of papers to be published and/or submitted is determined by each university; I noticed that it usually ranges from 3 to 6 papers.


Before I start, let me briefly introduce myself: I am a PhD candidate at the Chair of Urban Design, TU Berlin with Prof. Jörg Stollmann. I started my PhD in May 2017 and recently submitted my cumulative thesis of 5 papers in January 2021. While I am waiting to defend my thesis, I am writing this article in an attempt to provide some insights into one of the common challenges of conducting a cumulative dissertation: how to structure the series of papers so that they make sense. The papers have to be connected with a thread (sometimes referred to as the thesis's golden thread, see here and here) and this thread can be presented as part of the overarching argument of the papers together. While each paper has one or more research questions, all the papers together respond to one central question. I would like to share with you some ideas on how the papers can be combined, creating something bigger than the individual parts. One might assume that it is usually the goal of the first year of a PhD to decide on the structure and scope of the papers, which then guide the next phases of the dissertation. Well, in reality this is not always the case.

Of course, the logic of structuring a cumulative thesis depends heavily on the research area and interest. Accordingly, while I share five approaches to structuring cumulative dissertations, I will try to clarify what each approach is suitable for. Each topic can be addressed from different angles, based on the research question, objective, and preferences of the author.

How my own research interest changed

To proceed from here, let's take a simple derivative of my thesis topic as the basis for experimenting with the different strategies explained in the following. I wrote my thesis about “The Digitalization of Urban Planning”. Over the years, my overarching research question became: In the mediatized world, how and why do planning visualizations become a question of social and spatial justice? I started the dissertation with a clear interest in exploring the entanglement of urban studies and media studies in relation to issues of justice in cities, but the final overarching research question only became clear towards the end of the thesis. This is because I was following an explorative and grounded research approach. Looking back now, I must admit that the earlier the overarching research question is clear, the easier the research process. Nonetheless, it is also important to stay flexible throughout the thesis process and let it shape the overarching thread. A middle ground would be a good option!

The strategies

By the time I realized all this, I had used more than one strategy to combine and look at the papers. In the following section, some of these strategies are presented in addition to other ideas. This list is surely not exhaustive.

By field

Structuring the papers in a cumulative dissertation by field would make each paper concentrate on one field or context of the topic where it is practiced. Speaking about my research, the different fields could be the digitalization of planning in planning education, planning practice, politics, culture, context, theory, or research. Following this strategy, each paper would cover one of these fields. This strategy is useful for a thesis that involves an analysis of perceptions or disciplines and interdisciplinary analysis.

By actors

Structuring the papers by the actors or participants involved in the research would allow each paper to tackle who is involved in the topic and whose visions are to be explored. Taking the example of my topic of planning digitalization, the papers would focus on the views of planners, policymakers, the general public, and computer scientists. The choice of which actors to highlight in the papers will mainly depend on the overarching research question and objective. This strategy is useful for a thesis that involves an analysis of reviews.

Chronologically

By choosing to structure the arguments chronologically, each paper naturally tackles the "when" of the overall topic. In my case the papers would focus on the printing age, the computer age, and the information age. Another example could be to focus on the different stages of the process of digitalization of planning in each paper. The choice of these processes or temporal milestones reflects how a certain phenomenon has changed throughout history and time. This strategy is useful for a thesis that involves a historical analysis.

By case study

In this strategy, each paper takes a case study related to the chosen research topic. Speaking of my research, papers can focus on extreme cases that manifest the research topic, or similar cases that highlight a phenomenon (for more information on the types of cases, Flyvbjerg 2013 is a valuable resource: https://arxiv.org/pdf/1304.1186.pdf ). The choice of the cases will mainly depend on the adopted methodology. This strategy is useful for a thesis that involves a comparative analysis of multiple case studies.

By location

In this case, each paper would study a specific location, the “where” in the topic. In my case the papers would focus on  the digitalization of planning in different cities, or in different countries, or even highlight different parts within a city (such as formal versus informal, or rural versus urban). The choice of the location will mainly depend on the overarching research question and objective. This strategy is useful for a thesis that involves a geopolitical analysis.

Using more than one?

In my thesis, I combined strategy 1 (field), 2 (by actors), and 4 (by case). I started out using field and actors, but as I reached the end of the thesis and was looking backward to finalize my overarching argument, I realized that my papers also differed in terms of cases. In fact, one can see these different strategies as a decision on which variables to highlight and which aspects to fix.

At the beginning of the thesis journey, I was interested in writing papers in a way that presented different fields of action and the views of different actors in each field. Thus, I proceeded to look at the mediatization of urban planning in five fields: planning education, planning practice, planning politics, planning context, and planning culture (which I later referred to as communicative situations). In each paper and field, I highlighted specific actors involved in the process (for example, planning educators and students in planning education). Later I realized that, additionally, each paper reflected the use of a specific planning visualization: the education curriculum in planning education, street billboards in planning practice, press news in planning politics, city streets in planning contexts, and TV advertisements in planning culture. My overarching concern was to explore the question of "visible urban visions versus invisible urban challenges".

I hope these strategies can be a starting point for those who have chosen the cumulative format for their thesis. Last but not least, I would like to mention that this list is not exhaustive, so it would be great to open up the discussion on other strategies. For any questions, one-on-one discussions, or insights, feel free to reach out to me via LinkedIn.


Author info

Mennatullah Hendawy is a PhD Candidate at the Chair of Urban Design, TU Berlin, where she is also co-leading two research groups: Connecting Urbanity and Towards Equitable Planning Curricula. Mennatullah is also a visiting researcher at the Leibniz Institute for Research on Society and Space in Erkner, Germany, and an affiliated Assistant Lecturer at the Department of Urban Planning and Design in Ain Shams University in Cairo, Egypt. Hendawy is co-founder of Cairo Urban AI and First Degree Citizens, an initiative that tackles critical socio-legal geography. She works on the intersection of urban planning, mediatisation, visualisation, and justice, where she is fascinated by the way knowledge, power, and agency are manifested in and co-construct cities and the public sphere.


Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity: Evaluation of similarity methods on a new short text task


Handling Editor: Vincent Larivière

  • Funder(s): Bundesministerium für Bildung und Forschung
  • Award Id(s): 01PQ16004, 01PQ17001

Paul Donner; Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity. Evaluation of similarity methods on a new short text task. Quantitative Science Studies 2021; 2 (3): 1071–1091. doi: https://doi.org/10.1162/qss_a_00152


Cumulative dissertations are doctoral theses comprised of multiple published articles. For studies of publication activity and citation impact of early career researchers, it is important to identify these articles and link them to their associated theses. Using a new benchmark data set, this paper reports on experiments of measuring the bilingual textual similarity between, on the one hand, titles and keywords of doctoral theses, and, on the other hand, articles’ titles and abstracts. The tested methods are cosine similarity and L1 distance in the Vector Space Model (VSM) as baselines, the language-indifferent methods Latent Semantic Analysis (LSA) and trigram similarity, and the language-aware methods fastText and Random Indexing (RI). LSA and RI, two supervised methods, were trained on a purposively collected bilingual scientific parallel text corpus. The results show that the VSM baselines and the RI method perform best but that the VSM method is unsuitable for cross-language similarity due to its inherent monolingual bias.

1. INTRODUCTION

1.1. Background and Motivation

What is the contribution of early career researchers (ECRs) to a country’s research output? This question is currently of high science-political interest in Germany and of similarly high practical difficulty to answer ( Consortium for the National Report on Junior Scholars, 2017 , p. 19). The training of qualified research workers is widely regarded as a core mission of universities and accordingly the performance of universities and departments with respect to the training of ECRs plays a prominent role in research evaluation systems. Yet, despite the acknowledged interest in performance of ECRs in terms of scientific output—publications and their citations—this facet of performance, research output, has so far not become part of university evaluation systems or national scale monitoring instruments. Beyond these science-political considerations, the research contribution and performance of ECRs is intrinsically interesting. Comprehensive performance data would enable longitudinal observation and trend detection, comparisons between ECRs of different disciplines, and perhaps the detection of effects of political interventions or different institutional conditions across legislatures (federal states) and organizations (universities) pertaining to ECR performance.

PhD theses are the primary published research output of a completed PhD degree in Germany because the full thesis needs to be published in some format for a degree to be conferred. The publication format may be a regular book with a scientific publishing house or a digital document deposited at a university repository. The regulations vary and are locally determined at the university or department level. Many doctoral students also publish in periodicals and contribute to conference proceedings and edited book chapters. These articles by one PhD candidate might be collated, supplemented with introduction and conclusion material, and submitted as a cumulative PhD thesis, which still needs to be published as a unit. The other class of theses is monograph theses, which have been designed as one single work from the outset and which are not published in parts otherwise.

The importance of cumulative (i.e., article-based) dissertations in Germany can be seen from a number of recent surveys among PhD students and graduates. A 2014 survey of PhD graduates found that 14.5% completed a cumulative thesis while 84% handed in a monograph thesis ( Brandt, Briedis et al., 2020 ). This survey also inquired about the number of cumulative articles. The mean number was 4.0 and the median 3. An analysis by the German Federal Statistical Office found that in 2015 the share of PhD students working on a cumulative dissertation was 23% while 77% worked on monograph theses ( Hähnel & Schmiedel, 2016 ). For seven science domains, the figures for cumulative dissertations varied between 13% in language and cultural studies and 60% in agricultural and food sciences. A 2018 survey asked PhD students on the planned format of their thesis: 25% planned a cumulative thesis, 57% a monograph thesis, and the bulk of the remainder were undecided ( Adrian, Ambrasat et al., 2020 ). In summary, while the monograph dissertation remains the more common format, the importance of cumulative dissertations is substantial and increasing. Therefore, both theses and constitutive articles need to be taken into account in studies of knowledge production and the citation impact of ECRs.

Our study aims to partially assemble the technical requirements for bibliometric studies of the output of doctoral degree holders, which so far are lacking. In particular, we evaluate several methods of short text similarity calculation on their ability to support the identification of the elemental articles of cumulative PhD theses. This is only one component of a complete PhD candidate article identification system, as will be discussed below, but a centrally important one. It is necessary to identify thesis-related articles in the first place because, in the case of Germany, there is no public register of PhD students or graduates, nor is there a comprehensive source of persistent identifiers of PhD students or graduates and their nonthesis articles. With a central register containing PhD candidate names and their university affiliations, it would be possible to do comprehensive and targeted searches of publication databases. Yet, without due caution the results would probably contain inaccuracies. As different persons can have the same name and a single person can publish with different names, an author name disambiguation system is a general requirement for high-quality author-level data. Not all publication databases have such systems, and some vendors do not report on their matching quality. Even if there were a perfect author name disambiguation system, one would need a reliable automatic system for identifying the correct author record among candidates (identity matching) because of name homonymy (different persons with the same name). Another reason why information on names and university affiliations is insufficient for finding thesis-associated articles is that far from all doctoral candidates are formally affiliated with the universities at which they obtain their degree (e.g., Gerhardt, Briede, & Mues, 2005).

It is possible to bypass the author identity problem by simply considering the author names of a thesis and of candidate-associated articles as only one feature of a larger, jointly used feature set. In other words, instead of matching articles via their disambiguated authors, one can match theses and articles via author names (rather than author identities) and a suite of other features, such as information on affiliation, publication year, and topic. This is the approach pursued in the project of which this study is a part. We anticipate handling the matching of thesis records and candidate article records by supervised classification algorithms. As a recent example of this approach, Heinisch, Koenig, and Otto (2020), in a thematically closely related study, perform machine learning-based record linkage on bibliographic data on German doctorate recipients with administrative labor market data to trace their career outcomes.

An important feature for matching cumulative theses and their constitutive articles is their topical similarity ( Echeverria, Stuart, & Blanke, 2015 ; Zamudio Igami, Bressiani, & Mugnaini, 2014 ) and we investigate in this paper the optimal computation of topical similarity under the specific conditions of the task at hand. In other words, due to the complexity of the subtask of finding a good topic similarity measure for this specific application, in this contribution we cannot address the entire PhD candidate article identification system, but focus on this important subtask. It is important to mention this context to preclude the misapprehension that the similarity measures studied will be used in isolation to identify articles constituting cumulative PhD theses. The basic premise is that articles on the same or a similar topic are more likely to be proper parts of a given thesis than topically remote or unrelated articles, even by the same authors. In addition, topical similarity can be useful to distinguish between articles by different authors with the same name or name abbreviation in the aforementioned automatic classification stage. The results obtained in this study could therefore also inform future research in author disambiguation. Topical similarity is most conveniently operationalized as textual similarity. While other operationalizations are also appropriate, such as distance in the citation network, the basic data for such approaches is not directly available in the dissertation bibliographic data at hand.

1.2. Contribution of This Study

The best available dissertation data (national library bibliographic records) only contain the titles and, for a subset, content descriptor terms assigned by catalogers. Dissertation titles can be quite succinct—two examples from the data set introduced later are “On operads” and “Fairer Austausch” (“Fair exchange”). This is less of a problem for the candidate associated articles to be matched because their metadata usually also contains an abstract. Therefore this task is an example of the short text similarity problem, which is an area of intense specialized research at the intersection of information retrieval and natural language processing in recent years ( Kenter & De Rijke, 2015 ). The difficulty in calculating similarities for short texts is that two short texts on the same topic are not very likely to use the same terms (the vocabulary mismatch problem). Methods based on exact lexical matching of terms are thus likely to be inaccurate because of the restricted amount of information.
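The vocabulary mismatch problem described above can be reproduced with a minimal bag-of-words cosine similarity (a toy sketch with invented example titles, not the implementation evaluated in the paper):

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of simple bag-of-words term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Topically related titles with no overlapping terms score exactly zero:
print(cosine_similarity("doctoral thesis evaluation", "assessing PhD dissertations"))  # 0.0
# whereas exact lexical overlap is rewarded:
print(cosine_similarity("doctoral thesis evaluation", "evaluation of a doctoral program"))
```

This is why exact-lexical-match methods are unreliable for very short texts: with so few terms, a zero score says little about true topical distance.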

The textual data are domain specific, as they all are formal reports of scientific research. Scientific text usually contains many specialized technical terms and may use some words with specific meanings other than those in common language use. Therefore, methods and language resources designed for domain-general text, such as news, might not be ideal.

Dissertation theses from Germany are typically written in either German or English, other languages being uncommon. For cumulative theses, the incorporated articles need not be in the same language as the thesis title might indicate. It follows that a text similarity method should be able to measure similarity across different languages.

The present paper reports on a study in applied cross-language information retrieval (CLIR) for the purpose of science studies. No new methods are developed, but existing methods are applied to a novel task. The above presented combination of specific factors of the nature of the task means that we cannot simply rely on prior descriptions of the performance of text similarity methods, as these were evaluated on very different problems. Measuring the textual similarity between doctoral theses and their possible constitutive articles on the basis of bibliographic data in a cross-language setting has to our knowledge not been studied before. It was therefore necessary to collect an appropriate ground truth sample to evaluate the studied methods. Furthermore, we also collected domain-specific translated texts to train the tested supervised methods on appropriate data. As the conventional evaluation metrics cannot be applied because of the cases for which no matches should be retrieved (monograph theses—theses without constitutive articles), the choice of appropriate metrics is discussed in some detail.

While this study is concerned with measuring textual similarity between doctoral theses and associated articles, the task of semantic similarity calculation between short representations of scientific texts is of wide applicability in science studies. The calculation of text similarity in bilingual scenarios is of particular importance to those national science systems where English is not the native language and where much research is published in other languages, for which it is crucial to determine links with the international English research literature. It should therefore be noted that even though we only consider the specific scenario of German and English language publications, the methods studied here can be used for any language combination.

The paper proceeds as follows. In the next section we review the related literature. In Section 3 we describe the data sets used in this study. Data preprocessing and the various tested text similarity methods are treated in Section 4 , followed by the presentation of our results ( Section 5 ) and a discussion of these findings ( Section 6 ).

2. LITERATURE REVIEW

We focus here, first, on prior research in the paradigm of distributional semantics for CLIR. We make this restriction because of the decisive advantage of these methods, which is that they do not require lexical matches to calculate text similarities. This is crucial for the task of short text similarity calculation, where the probability that two compared texts include the same terms is inherently small, quite independent of their true topical similarity. Second, we also review the application of the selected methods for similarity calculation in the field of scientometrics in general (beyond cross-language information retrieval) to provide a more specific context for the use of these methods in the present study.

This line of research was inaugurated with the Latent Semantic Analysis (LSA) model ( Deerwester, Dumais et al., 1990 ), which was extended for cross-language retrieval by Dumais, Letsche et al. (1997) . LSA applies statistical dimension reduction (singular value decomposition, SVD) to the sparse weighted term-document matrix created from a text corpus to obtain a smaller and dense “semantic” vector space representation in which both terms and documents are located. For all input documents and terms, profiles of factor weights over latent extracted factors are obtained that characterize these entities based on observed term co-occurrences in the data. LSA has found significant use in the field of scientometrics. Landauer, Laham, and Derr (2004) illustrate the use of LSA for visualizing large numbers of scientific documents by applying the method to six annual volumes of full texts of papers from PNAS. This study highlights the possibilities of interactive, user-adjustable displays of documents. Natale, Fiore, and Hofherr (2012) studied scientific knowledge production on aquaculture using LSA and other quantitative publication analysis methods. This is an interesting case study because LSA as a topic identification method was triangulated with topic modeling and cocitation analysis on the same corpus of documents. Article titles and keywords were used as inputs for LSA and the similarity values between words from the semantic space were visualized with multidimensional scaling. Important for scientometric applications is that the LSA method is not restricted to textual term data, which was exploited by Mitesser, Heinz et al. (2008) and Mitesser (2008) , who applied SVD to matrices encoding papers and cited references of volumes of journals to measure the topical diversity of research and its temporal development, assuming topical structure to be implicit in the patterns of cited literature.
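The mechanics of LSA can be illustrated on a tiny invented term-document matrix (a rough sketch of the SVD step only, not the paper's weighted setup): after truncation, documents can be similar without sharing most of their terms.

```python
import numpy as np

# Toy term-document count matrix: rows = terms, columns = documents (invented data).
# Docs 0 and 1 share only the term "doctoral"; doc 2 is about something else entirely.
A = np.array([
    [1, 0, 0],   # "thesis"
    [0, 1, 0],   # "dissertation"
    [1, 1, 0],   # "doctoral"
    [0, 0, 2],   # "banana"
], dtype=float)

# LSA: keep only the k largest singular directions to get dense document vectors.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # one k-dimensional vector per document

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Raw bag-of-words cosine between docs 0 and 1 is 0.5; in the rank-2 LSA space their
# shared latent factor makes them (here) identical, while doc 2 stays orthogonal.
print(cos(doc_vecs[0], doc_vecs[1]))   # ~1.0
print(cos(doc_vecs[0], doc_vecs[2]))   # ~0.0
```

For cross-language LSA, the training corpus consists of aligned bilingual documents merged into single pseudodocuments, so translated terms load on the same latent factors.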

Random Indexing (RI) is a direct alternative to LSA with lower computational demands, meaning that it can be applied to much larger corpora. Sahlgren and Karlgren (2005) used the RI approach to CLIR for the task of automatic dictionary acquisition from parallel (translated) texts in two languages. Moen and Marsi (2013) experimentally studied the performance of RI in ad hoc monolingual and cross-language retrieval (German queries, English documents) on standard evaluation data sets. Their variant of the RI method only used a translation dictionary but no aligned translated texts. RI compared unfavorably to the standard VSM and dictionary-based query translation in CLIR. To this day, there is little empirical research on CLIR applications of RI and none on short text similarity, to the best of our knowledge. RI was introduced into the domain of science studies by Wang and Koopman (2017) , who describe the application of the method to a benchmarking data set used for testing different approaches to scientific document clustering for automatic data-driven research topic detection. A critical particularity of their method is that each document is represented by features of distinct types, namely topic terms and phrases extracted from titles and abstracts, author names, journal ISSNs, keywords, and citations. For each such entity, a 600-dimension real-valued vector representation is learned by random projection from their co-occurrences in the corpus. Next, a vector representation for every document is calculated as the weighted centroid of the vectors of its constituting entities. Clustering algorithms are then applied, one to the set of semantic document vectors directly, and another one to a network of similarity values of each document to its nearest neighbors, in which similarities are calculated as the cosine of the document vectors.
Their implementation of RI is further developed in Koopman, Wang, and Englebienne (2019) with improved entity weighting and entity vector projection giving better representations. This study showed the application of the method to a different task of relevance to scientometrics: automatic labeling of documents with terms from a large controlled vocabulary. A version of RI was benchmarked against competing state-of-the-art word embedding methods, trained on the same data, at predicting known withheld Medical Subject Headings for biomedical papers, in which it achieved good results.

Vulić and Moens (2015) introduced a comprehensive CLIR system using word embeddings of several languages in one common vector space, which they call “shared inter-lingual embedding space.” They point out that in the word embedding retrieval paradigm, monolingual search and cross-lingual search can be integrated into one system in which search within a single language only uses that part of the system relating to one language. Cross-lingual and monolingual search in any of the supported languages are combined seamlessly in multilingual word embedding-based systems, obviating the need for query expansion or result list merging inherent in machine translation-based systems. The study also demonstrated that bilingual embeddings viable for cross-language ad hoc retrieval can be obtained from document-aligned parallel corpora and that finer-grained information, such as sentence or word alignments, is not required. The system of Vulić and Moens (2015) relies on bilingual pseudodocuments, documents formed by merging and shuffling terms from aligned documents in two languages, similar to the method of cross-language LSA ( Dumais et al., 1997 ), and achieved very competitive results on standard benchmarking data sets for ad hoc CLIR, outperforming prior state-of-the-art models in their test setting (English and Dutch queries and documents). Their results further show that for constructing the document-level representations from the term vectors in the word embeddings CLIR paradigm, a standard term weighting approach outperforms an unweighted additive approach to composition.

To conclude this overview, interested readers are directed to Ruder, Vulić, and Søgaard (2019) for a comprehensive survey of cross-language word embedding models, which categorizes them by the required training data according to the type of alignment (word, sentence, document) and the degree of comparability (truly parallel translated texts vs. merely comparable corpora).

3.1. Bilingual Dissertation-Article Pairs

Because the objective of this study is to compare methods of measuring text similarity between cumulative doctoral theses and the articles of which they are composed, we created a manually curated ground truth data set of doctoral theses accepted at German universities and their associated Scopus-covered articles, which we use to evaluate the performance of the methods. We chose the bibliographical and bibliometric database Scopus as a source for article-level bibliographic data because many thesis-related articles are published in the German language and Scopus covers more German-language literature than Web of Science. For the period from 1996 to 2017, Scopus contained around 694,000 German-language article records and Web of Science around 500,000 records.

The German National Library (Deutsche Nationalbibliothek, DNB) catalog currently provides the most comprehensive source of data on German dissertation theses. As part of its legal mandate, the DNB collects and catalogs all works published in Germany, and universities regularly submit dissertation and habilitation theses as deposit copies to the DNB. The DNB collection mandate extends to theses accepted at German universities but published by foreign publishing houses. Nevertheless, there is no reliable information on the completeness of the DNB thesis collection. The DNB catalog clearly identifies all dissertation theses but may contain multiple versions of one thesis, such as a print version, a digital version, and a commercially published version. These versions of one work are not always linked and need to be de-duplicated for analytical purposes. DNB dissertation data have recently been used several times in scientometric research ( Heinisch & Buenstorf, 2018 ; Heinisch et al., 2020 ) 6 . DNB dissertation data are a viable, if challenging, data source for studies on German PhD theses and there is at present no better large-scale source for German PhD thesis data.

Basic bibliographic data for dissertation theses were therefore obtained from DNB catalog records. Records for all PhD dissertations from the German National Library online catalog were obtained in April 2019 using a search restriction in the university publications field of “diss*”, as recommended by the catalog usage instructions, and publication year range 1996 to 2018. Records were downloaded by subject fields in the CSV format option, except for the subject medicine 7 . In this first step 534,925 records were obtained. In a second step, the author name and work title fields were cleaned, the university information was extracted and normalized, and non-German university records were excluded. We also excluded records assigned to medicine as a first subject class, which were downloaded because they were assigned to other classes as well. As the data set often contained more than one version of a particular thesis because different formats and editions were cataloged, these were carefully deduplicated. In this process, the records containing the most complete data and describing the temporally earliest version were, as far as possible, retained as the primary records. Variant records were also retained separately. This reduced the data set to 361,655 records, only a small part of which is used in this study.

After these cleaning operations, the DNB dissertation data set contains the bibliographic information for probably nearly all German nonmedical PhD theses in the period covered. To construct the ground truth data set, the next steps were to identify cumulative theses in the processed DNB data, to extract the bibliographic information of constitutive articles of the cumulative theses, and to link these article-level records with Scopus records.

For the first of the above steps, the identification of a sample of cumulative dissertations, we proceeded as follows. In general, the DNB records do not indicate the type (cumulative or monographical) of dissertations. However, we found and used a small number of DNB records from the above data set containing the phrase “kumulative Dissertation” in the title of the record. These were mostly from a single university. Our second approach was to use the full-text URLs in the DNB data. For those DNB records that contained such a URL, we attempted to download the full-text PDF file and, if successful, extracted the plain text and indexed it for searching. While we were able to download many full-texts, the majority of university repository URLs turned out to be outdated and unreachable. We were able to obtain 36,640 thesis full-texts, which were searched for keywords and phrases indicating a cumulative thesis.

As a third approach, we randomly sampled universities and searched their online publication repositories for dissertations containing keywords or phrases indicating cumulative theses. For promising-looking hits, the thesis full-texts were downloaded for examination.

For all theses identified as possible cumulative dissertations through these methods we obtained the published full-texts via university repositories. We manually searched all downloaded full-text PDFs for explicit statements about articles associated with a cumulative thesis. For articles that are described by the thesis authors as being part of the thesis or that appear as chapters, we extracted the corresponding bibliographic data. We thus used three independent and complementary methods to identify theses which might contain information on constitutive articles.

Next, we manually searched for the identified associated articles from the cumulative theses in a snapshot of the Scopus bibliometric database from spring 2019 and assigned the Scopus item identifier to the extracted article records. Only Scopus items with the document types article, review, conference paper, chapter, or book were retained. We also kept track of all examined theses for which no associated articles were indicated in the full-texts. These are also included in the ground truth data as negative cases. This sample is therefore only a convenience sample and not a statistically representative random sample of a population 8 .

The resulting ground truth data set contains 1,181 doctoral thesis records, of which 771 refer to theses with German titles and 410 to theses with English titles. All thesis records are described by bibliographic information from the DNB. Of these records, 449 were identified as cumulative doctoral theses, but 21 of them did not have any Scopus-covered articles. Of the 428 cumulative theses with Scopus-indexed constituent articles, 218 had German titles and 210 had English titles. A total of 732 theses were identified as standalone theses without any incorporated articles. There were 1,499 pairs of theses and Scopus-indexed articles out of 1,946 thesis-article pairs in total. The Scopus coverage of this data set’s thesis-associated articles is approximately 77%. The cross-tabulation of thesis language and article language for the subset of the final data set with Scopus-indexed articles is shown in Table 1 . Note that throughout the remainder of the paper we abbreviate German and English in the tables as “de” and “en,” respectively. It can be seen that among German-language theses there is a preponderance of English-language articles, and while most articles of cumulative theses with English titles were also written in English, a few German articles appear among them as well. However, it should be kept in mind that the Scopus data always contain English titles and abstracts, while German titles are additionally present for German-language articles. Thus the cross-language problem is possibly at least partly mitigated by the presence of both German and English text for German articles in Scopus.

Dissertation thesis title language and article language for data set of cumulative dissertations

This test data set consists, for the doctoral theses, of author names, thesis title in either German or English, German language keywords assigned by DNB catalogers (only partial coverage), publication year, and university. Article bibliographic data from Scopus are comprised of author names, title in English (always present) and German (sometimes present), English abstract, publication year, and disambiguated German institution (if present). For the text similarity task of this paper, only the thesis title and keywords and the article titles and abstracts were used. Copyright statements in Scopus abstracts were removed. Author names and publication years were used for article candidate preselection, as described next.

At this stage, the validation data set consists of all true positive and a limited number of true negative pairs of thesis records and article records. Yet our envisioned article identification system must in principle be able to identify the right article records for a PhD thesis among all records in the Scopus database. As it would be too computationally costly to actually compare each thesis record with all article records, we have created a heuristic candidate article pre-filter method for the Scopus data as part of our larger article identification system. Because this procedure is only of minor importance to this study, its description is deferred to Appendix A1 in the supplemental material. This filtering stage reduces the number of candidates per thesis record to about 1,500 Scopus article records on average.

3.2. Training Data

Two of the methods that we experiment with, LSA and RI, require parallel bilingual training data. This means texts in the two languages of the models that are direct translations. As we are working only with texts in the scientific domain and large, manually translated text corpora are not available for this domain for the English-German language pair, we obtained purpose-specific training data. Bilingual scientific texts were collected from the abstracts of journals that provide both German and English abstracts (50,184 abstracts). A second source of bilingual data is dissertation abstracts, which were obtained from universities’ publication servers. We collected 30,275 abstracts of doctoral theses from 10 German universities. Furthermore, we used research project descriptions of projects funded by three funding organizations, the German DFG 9 , the Swiss SNF (obtained from the P3 database 10 ) and the EU ERC (obtained from the CORDIS database 11 ). We used 21,609 DFG, 685 SNF, and 4,997 ERC project descriptions.

We also included the German-English dictionary from the BEOLINGUS translation service of Technical University Chemnitz, version 1.8 12 , as doing so generally improves retrieval performance compared to parallel text alone ( Xu & Weischedel, 2005 ).

After preprocessing, the bilingual document-level corpus, without the dictionary, had a size of 14.4 million German and 15.8 million English tokens in 108,000 parallel documents. German documents had on average 134 terms, English documents 146 terms. The dictionary contains 190,000 translations. The entire corpus, including the dictionary, contained 800,000 different German and 240,000 different English terms and is therefore large enough for training in cross-lingual information retrieval ( Vulić & Moens, 2015 ; Xu & Weischedel, 2005 ).

4.1. Preprocessing

The text data of the bilingual training data and the test data (dissertation titles + keywords and article titles + abstracts) are processed by removing stopwords ( R package stopwords ; Benoit, Muhr, & Watanabe, 2020 ), tokenizing (including lowercasing) and applying language-specific stemming ( R package tokenizers ; Mullen, Benoit et al., 2018 ), removing numeric tokens ( R package tm ; Feinerer, Hornik, & Meyer, 2008 ), and discarding tokens of one or two characters in length 13 . The stemming uses an interface to the libstemmer library implementation of Martin Porter’s stemming algorithm, which continues to exhibit high performance ( Brychcín & Konopík, 2015 ). Stemming helps in overcoming the vocabulary mismatch problem by reducing related terms to one common stem (see, for example, Tomlinson (2009) for German and English monolingual retrieval) while at the same time reducing the size of the vocabulary. This is not universally beneficial, as some unrelated terms can be conflated to the same stem. Stemming thus can improve recall while incurring some loss in precision. For the fastText experiment, the terms are not stemmed, as the fastText word embeddings were built with unstemmed text, but otherwise processed as described.
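The pipeline above can be sketched as follows. This is a minimal Python illustration, not the R implementation used in the study; the tiny stopword set and the identity stemmer are placeholders for the full language-specific resources (R packages stopwords and tokenizers):

```python
import re

# Tiny illustrative stopword set; the real pipeline uses full
# language-specific lists for German and English.
STOPWORDS = {"the", "and", "of", "der", "die", "und"}

def preprocess(text, stem=lambda t: t):
    """Lowercase, tokenize, drop stopwords, numeric tokens, and
    tokens of one or two characters. `stem` is a placeholder for a
    Porter/Snowball stemmer."""
    tokens = re.findall(r"\w+", text.lower())
    out = []
    for tok in tokens:
        if tok in STOPWORDS:
            continue
        if tok.isdigit():      # remove numeric tokens
            continue
        if len(tok) <= 2:      # discard 1- and 2-character tokens
            continue
        out.append(stem(tok))
    return out
```

A call such as `preprocess("The groundwater model of 2008 and die Stadt")` keeps only the content-bearing tokens, here `["groundwater", "model", "stadt"]`.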

4.2. Text Similarity Models

A large number of text similarity calculation methods and models have been proposed over the years in the literature. We have chosen three baseline and two state-of-the-art methods based on the following considerations. We employ both simple baselines and advanced methods as we are interested in whether basic methods show sufficiently good performance on this novel task or whether more recently proposed methods rooted in the semantic embedding paradigm can better handle the task. With regard to the specific choice of models, the vector space model (VSM) has been the central paradigmatic approach in the field of information retrieval for decades and is routinely used as a baseline to compare novel methods against. The LSA model is an early but well-studied representative of the semantic embedding family of methods, which was proposed for multilingual retrieval applications. The n -gram similarity method was chosen as it is a conceptually distinct approach from the vector space similarity of all the other considered models and it has shown some promise in multilingual retrieval. For state-of-the-art language-aware semantic embedding methods, we have chosen fastText because of its reportedly good results, wide application, and readily available precomputed vector data, while RI was chosen because it can be trained on custom data with little computational cost and thus serves well for studying the impact of using domain-specific training data to construct task-specific models. As our LSA model is also trained on the same data, we also have an opportunity to compare the performance of these two methods given the same training data.

4.2.1. Baseline models

4.2.1.1. vector space model.

We use the basic VSM ( Salton, Wong, & Yang, 1975 ) as a language-agnostic baseline. In the VSM, documents are represented by weighted term vectors. We apply standard term frequency-inverse document frequency weighting (tf-idf) to mitigate the distorting effects of unequal term occurrence frequencies. Vector representations of any two documents can be compared by several different vector distance or similarity operations and it is not a priori clear which one is best for a specific purpose ( Aggarwal, Hinneburg, & Keim, 2001 ). The conventional choice of similarity measure in the VSM is the cosine similarity and Aggarwal et al. (2001) have shown that for L k norm distance functions with different values of k the choice of k = 1 works well. We therefore experiment with cosine similarity and L 1 distance.
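The two comparison operations on tf-idf weighted term vectors can be sketched as follows (illustrative Python, not the study's implementation; the idf variant shown is the plain logarithmic one, and documents are represented as sparse term-weight dictionaries):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one dict per document
    mapping term -> tf-idf weight."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vecs

def cosine(u, v):
    """Cosine similarity of two sparse term-weight vectors."""
    num = sum(w * v.get(t, 0.0) for t, w in u.items())
    den = (math.sqrt(sum(w * w for w in u.values()))
           * math.sqrt(sum(w * w for w in v.values())))
    return num / den if den else 0.0

def l1(u, v):
    """L1 (Manhattan) distance; smaller means more similar."""
    terms = set(u) | set(v)
    return sum(abs(u.get(t, 0.0) - v.get(t, 0.0)) for t in terms)
```

Note that cosine is a similarity (higher is better) while L1 is a distance (lower is better), which matters later when scores of the two variants are compared.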

4.2.1.2. LSA model

As a second baseline model we construct a joint pseudobilingual vector space using LSA from the same preprocessed English-German parallel corpus introduced in Section 3.2 . LSA consists of the application of statistical dimension reduction of the document-term matrix to lower dimensionality to obtain a latent semantics vector space in which terms commonly occurring together in a context (here documents) have similar vectors and in which documents sharing many terms have similar vectors in the same vector space ( Deerwester et al., 1990 ). This method is one way to address the vocabulary mismatch problem for short texts. For two texts to be highly similar according to LSA, they need not share any terms; they only need to contain terms that frequently appeared in the same documents in the training data, or more indirectly, they need to contain terms that appeared in documents that contained other terms that frequently co-occurred in the training data. LSA can be applied to multilingual problems by creating combined multilingual pseudodocuments from translated texts ( Dumais et al., 1997 ). This method is not intrinsically multilingual as there is no information contained in the resulting model about which language a term is from. Therefore, terms of identical spelling with different meanings in different languages will inevitably be conflated to one vector representation.

For this experiment, the preprocessing consisted of tokenization, stopword removal, lowercasing, and language-specific stemming. New pseudodocuments were created by concatenating the German and English texts of each document in the training data. From these processed document representations, a tf-idf matrix was created with the text2vec R package ( Selivanov, Bickel, & Wang, 2020 ). This resulted in an m × n (297,852 × 923,864) sparse document-term matrix M . Truncated Singular Value Decomposition with the R package RSpectra ( Qiu & Mei, 2019 ) was applied to the matrix to obtain the latent space model with t = 1,000 dimensions in which M ≈ U Σ V *, with U an m × t document-by-latent-factors matrix of left singular vectors, Σ a t × t diagonal matrix carrying the t largest singular values, which weight the latent factors, and V an n × t term-by-latent-factors matrix of right singular vectors, V * being its conjugate transpose. In this latent space it is possible to locate all input documents and all input terms. New documents are positioned in the space as new 1,000-dimensional vectors computed from those of their terms that also occurred in the training data; the similarity of any two new documents, regardless of language, is then obtained by calculating the cosine between their vector representations. The dimensionality of the latent vector space is a tunable parameter: A lower-dimensional model is obtained by simply using the first d dimensions of the vector space. To find the best-performing value, we try parameter values between 100 and 1,000 in increments of 100.
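The folding-in of new documents can be illustrated with a toy example. The following Python sketch uses a hand-specified factorization standing in for real truncated-SVD output; the term order, the values of V_t, and the singular values s are invented purely for illustration:

```python
import math

# Toy factorization standing in for a real truncated SVD.
# Assumed term order: ["wasser", "water", "stahl", "steel"]; t = 2.
V_t = [[0.7, 0.1],    # wasser (German)
       [0.7, -0.1],   # water  (English)
       [0.1, 0.7],    # stahl  (German)
       [-0.1, 0.7]]   # steel  (English)
s = [2.0, 1.5]        # top-2 singular values (invented)

def fold_in(doc_row, V_t, s):
    """Fold a new document (row of term weights) into the latent
    space: each latent coordinate is (d . V_t[:, j]) / s[j]."""
    return [sum(doc_row[i] * V_t[i][j] for i in range(len(doc_row))) / s[j]
            for j in range(len(s))]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0
```

In this toy space, a German query containing only “wasser” folds in close to an English document containing “water” and far from one containing “stahl”, even though the texts share no terms.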

4.2.1.3. Character n- grams

Character n -grams are substrings of n consecutive characters taken from larger strings. Segmenting texts into sets of n -grams allows calculating subword-level similarities between texts and is therefore another method that can partially overcome vocabulary mismatch. N -gram similarity has shown good results in several cross-language applications, particularly between related languages (e.g., McNamee & Mayfield, 2004 ; Potthast, Barrón-Cedeño et al., 2011 ). Cross-language retrieval with n -grams might be assumed to work better for scientific text than for general text because many technical terms are highly similar or identical across languages, such as names for health disorders, chemical substances, or organisms. We use the trigram ( n = 3) implementation of the PostgreSQL version 12 module pg_trgm 14 . The module’s pg_trgm.similarity() function returns a value between 0 (no common trigrams) and 1 (identical strings up to trigram permutation) based on the number of shared trigrams. The score is the ratio of the intersection and the union of the unique trigram sets of the two strings (Jaccard index). Similarities with this function are calculated on the preprocessed dissertation and article text data, in which the English and German parts have been concatenated, to keep the input texts to all methods constant. Note that the preprocessing has already eliminated some of the language-specific elements, such as inflections and function words. Applying n -gram similarity to these already stemmed data can show whether there is additional benefit in moving further away from the original words of the texts than stemming alone does, by splitting the stemmed tokens into n -gram subtokens.
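The core of the trigram score can be sketched in a few lines of Python. This simplified version conveys the Jaccard idea only; unlike pg_trgm, it omits the padding of words with spaces and the lowercasing that the PostgreSQL module performs:

```python
def trigrams(s):
    """Set of character trigrams of a string (simplified: no padding)."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def trigram_similarity(a, b):
    """Jaccard index of the two trigram sets: |A & B| / |A | B|."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, the stemmed forms “groundwater” and “groundwat” share 7 of their 9 combined unique trigrams, giving a similarity of 7/9 despite the differing suffixes.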

4.2.2. Language-aware semantic word embedding models

Word embedding models are vector representations of words (or terms) of fixed dimensionality learned from natural language corpora. Unlike the classic VSM, the dimensionality of the vector space in word embedding models is far smaller than the number of different tokens in a corpus and semantically similar words have similar vectors. LSA is one early example of word embedding models. More than one language’s words can be represented in a single word space. Such multilingual models can be constructed either by learning simultaneously from parallel translated texts or by aligning pre-existing monolingual models using some external translation data or other alignment data. As there are quite a number of such models ( Ruder et al., 2019 ), we chose two methods that were straightforward to use or implement, did not require external resources or code dependencies, and were known to scale well. Note that, in contrast to LSA, the following two methods do incorporate information about the language of terms and are thus properly multilingual, having different vector representations for terms of identical spelling in different languages.

4.2.2.1. FastText aligned multilingual models

FastText is a state-of-the-art word embedding method that achieves good results by learning from n -gram subword strings, rather than surface word forms, and representing words in the result vectors as the sum of their constituent n -grams ( Bojanowski, Grave et al., 2017 ). This enables the method to overcome difficulties arising from word morphology and rare words. The fastText method is derived from the word2vec Skip-gram with Negative Sampling method ( Mikolov, Sutskever et al., 2013 ). We used the precomputed multilingually aligned models released by Joulin, Bojanowski et al. (2018) 15 , which are trained on Wikipedia in different languages and aligned after training to map the terms of all languages into a common vector space. Note that while Wikipedia is a domain-general knowledge source, it does include vast amounts of scientific knowledge. As the current version of the official fastText programming library is no longer compatible with these vectors, we computed document representations in the database as the average of the fastText word vectors, looking up only exactly matching terms. This means that we cannot benefit from the ability of the fastText library to return results for out-of-vocabulary words. Documents are compared by summing the vectors of their respective terms with tf-idf weights, normalizing the result vectors, and calculating the cosine similarity of these aggregate document representations, following the basic method of Vulić and Moens (2015) .
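The document-level aggregation just described can be sketched as follows (illustrative Python; `word_vecs` stands in for the aligned fastText vectors, and the exact-match lookup mirrors our database implementation's behavior for out-of-vocabulary terms):

```python
import math
from collections import Counter

def doc_vector(tokens, word_vecs, idf):
    """tf-idf weighted sum of word vectors, L2-normalized.
    Out-of-vocabulary terms are skipped (exact-match lookup only)."""
    dim = len(next(iter(word_vecs.values())))
    vec = [0.0] * dim
    for term, count in Counter(tokens).items():
        wv = word_vecs.get(term)
        if wv is None:
            continue                      # no subword fallback here
        w = count * idf.get(term, 0.0)    # tf-idf weight
        for i in range(dim):
            vec[i] += w * wv[i]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def cosine(u, v):
    """Dot product; inputs are already L2-normalized."""
    return sum(a * b for a, b in zip(u, v))
```

Because both document vectors are normalized, the cosine reduces to a dot product; a document containing only unknown terms yields the zero vector, which corresponds to the cases where no similarity value can be computed.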

4.2.2.2. Random Indexing

RI is an incremental word embedding construction method and a direct alternative to LSA ( Sahlgren, 2005 ). Both use dimensionality reduction techniques to reduce the sparse term-document matrix of a training corpus into a smaller and dense real-valued vector space. In contrast to LSA, in RI the whole term-context matrix, which is usually very large and extremely sparse, is never materialized, the dimension reduction is less computationally demanding, and the model is incremental—it can be updated without a complete recomputation when new data is to be added. RI works by first assigning each document a distinct static index vector : a vector of specified dimensionality with values in {−1, 0, 1} drawn from a specific random distribution ( Li, Hastie, & Church, 2006 ). In multilingual RI, there is also a single index vector for each multilingual document. Next, context vectors for each term are created by scanning through all documents. For each term, the index vectors of the contexts (documents) in which the term occurs are summed in a single pass through the corpus. In this step, term context vectors for each language are generated separately. This projects both languages’ words into the same random-vector space ( Sahlgren & Karlgren, 2005 ). Reflective Random Indexing (RRI) is the iterative re-indexing of contexts (respectively terms) with previously obtained index vectors of terms (respectively contexts) instead of random vectors (i.e., higher order indexing; Cohen, Schvaneveldt, & Widdows, 2010 ). This way the model can also learn indirect associations between terms that never co-occur in any document but that each co-occur with the same third terms, similar to Second Order Similarity ( Cribbin, 2011 ; Thijs, Schiebel, & Glänzel, 2013 ). Training was done with simple binary occurrence counting of terms in documents—a term was counted as either present or absent, regardless of frequency within a document.
To obtain the similarity of two arbitrary documents, the tf-idf weighted context vectors of their constituent terms are added and normalized and then compared with cosine similarity, following Moen and Marsi (2013) and Vulić and Moens (2015) , just as in the other methods in the vector space paradigm. The dimensionality of the vector space is also a parameter in RI. Unlike in LSA, here the entire indexing process must be worked through for each different dimensionality parameter value. We also test values between 100 and 1,000 in increments of 100.

The RI methods were also implemented in PostgreSQL 12. For convenience, all vectors are L 2 -normalized. We use only second-order context vectors from RRI.
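The basic first-order indexing step can be sketched as follows. This is a minimal Python illustration with binary occurrence counting; the seed-per-document construction and the number of nonzero entries are simplifications for reproducibility in the sketch, not the study's parameters:

```python
import random

def index_vector(dim, seed, nonzero=4):
    """Sparse ternary index vector: `nonzero` randomly placed +/-1
    entries, all other entries 0."""
    rng = random.Random(seed)
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec

def train_ri(docs, dim):
    """First-order RI: each (bilingual pseudo)document gets one index
    vector; a term's context vector is the sum of the index vectors
    of the documents it occurs in (binary occurrence counting)."""
    ctx = {}
    for doc_id, tokens in enumerate(docs):
        iv = index_vector(dim, seed=doc_id)
        for term in set(tokens):          # present/absent, not frequency
            vec = ctx.setdefault(term, [0] * dim)
            for i, x in enumerate(iv):
                vec[i] += x
    return ctx
```

Terms occurring in the same documents accumulate the same index vectors and thus end up with similar context vectors; reflective (second-order) indexing would repeat the procedure using the learned term vectors in place of the random index vectors.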

4.3. Remarks

An important distinction among the tested methods needs to be pointed out. The VSM and trigram methods are unsupervised methods—they do not require any training data and work only with the input texts that are to be compared. LSA, fastText, and RI on the other hand are supervised methods. They require training on a text corpus. The fastText vectors we use are the result of training on Wikipedia articles in multiple languages. LSA and RI were trained specifically for this study on the bilingual training data described in Section 3.2 , that is, bilingual scientific texts 16 . In particular, we have chosen to train these methods on whole abstracts (or brief project descriptions) for the domain-specific vocabulary and a dictionary for the domain-general vocabulary rather than only on smaller contexts such as sentences or fixed-size word windows. The reason is that we would like to obtain embeddings primarily optimized for document-level topical similarity rather than word-level similarity. In contrast, fastText uses between one and five surrounding words ( Bojanowski et al., 2017 ).

To conclude the presentation of the models, Table 2 illustrates term similarities for three example terms for three supervised models. These impressions confirm that the models can learn enough from the training data to provide related result terms.

Examples of term similarities for three terms in LSA, RI, and RRI models

Note : The term “gdr” is from the abbreviation for German Democratic Republic, “groundwat” is the stemmed form of groundwater, and “stahl” is from German “Stahl” (steel).

For each tested method, 1,728,816 similarity calculations between thesis record representations and prefiltered candidate articles are computed. For the supervised methods, a similarity value cannot be obtained in every case, namely when either the thesis or the article texts do not contain any of the terms of the training corpus. This happens rarely and the exact figures will be given in the next section. There are between one and 47,654 similarity calculations per thesis, with an average of about 1,500. Very few of these are true positives and many theses have no true positives.

5.1. Precision and Recall

The evaluation of results for the thesis-article matches data set is not straightforward. The reason is that, for many theses, there are no matching articles, so no matches ought to be found. Such a situation is difficult to evaluate with classic precision and recall methodology as it presupposes true positives for every query. However, we still calculated precision and recall figures to understand the outcomes of this approach despite our reservations. For each evaluated method, up to 1,000 quantile values of the distribution of similarity scores between dissertation text and candidate article text were calculated—fewer if fewer distinct score values occurred. The similarity scores at these quantiles were used as threshold values. At each different threshold score, precision and recall were calculated by classifying all documents with scores greater than or equal to the threshold as positive and all those below as negative. We can thus obtain a picture of the possible range of the tradeoff between precision and recall (see Figure 1 ). Note that here we only report results for LSA, RI, and RRI with dimensionality 1,000 as we found these to be consistently the best values across the tested parameters. Detailed results for the differently parametrized methods can be found in Appendix A2 in the supplemental material.
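The threshold sweep just described can be sketched as follows (illustrative Python; in the study the thresholds are up to 1,000 quantiles of the score distribution, and labels encode whether a pair is a true thesis-article match):

```python
def precision_recall_curve(scores, labels, thresholds):
    """scores: similarity values; labels: 1 for true thesis-article
    pairs, 0 otherwise. At each threshold, every pair scoring >= the
    threshold is predicted positive."""
    total_pos = sum(labels)
    curve = []
    for thr in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / total_pos if total_pos else 0.0
        curve.append((thr, precision, recall))
    return curve
```

Lowering the threshold can only hold or raise recall while typically lowering precision, which traces out the tradeoff curve of Figure 1.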

Recall-precision plot.

5.2. Correlation

Another evaluation approach is suggested by the observation that the similarity scores for nonmatches (true negatives) should be as low as possible and those for matches (true positives) as high as possible. We construct a new variable by assigning scores of 0 and 1 for true negatives and positives, respectively, and measure the association between this variable and the empirically measured similarity scores of the tested methods with the point-biserial correlation coefficient r pb ( Tate, 1954 ), which is equivalent to the Pearson correlation coefficient when numerical values are assigned to the dichotomous variable. Table 3 shows the averages of the r pb per method weighted by the number of candidates. Note that the absolute values are all very small and they were multiplied by 1,000 for display in the table.

Table 3. Point-biserial correlation between ground truth and similarity methods, multiplied by 1,000
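With the match indicator coded 0/1, r_pb reduces to an ordinary Pearson correlation; a minimal sketch (scipy.stats.pointbiserialr computes the same quantity):

```python
import numpy as np

def point_biserial(match, score):
    """Point-biserial correlation: Pearson correlation between a
    dichotomous 0/1 match variable and continuous similarity scores."""
    match = np.asarray(match, dtype=float)
    score = np.asarray(score, dtype=float)
    return float(np.corrcoef(match, score)[0, 1])
```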

However, we have reason to doubt that r_pb adequately measures the performance we are really interested in. The data set is strongly dominated by true negatives. Due to vocabulary mismatch arising from the very short texts and the bilingual data, the ordinary VSM methods' similarity values are overwhelmingly often exactly 0 (cosine) or 1 (L1 distance). The other, more sophisticated models can compute similarities other than 0 or 1 even if there are no common terms and produce values that are not massively concentrated at one end of the possible range of values. This leads to uninformatively high r_pb for the VSM methods. We test this by computing the r_pb between a constant (here 0) and the dichotomous match variable. The results are in column "always 0" in Table 3. This method, equivalent to deterministically rejecting every candidate as irrelevant, achieves the best score according to r_pb, confirming that the point-biserial correlation coefficient is not a useful evaluation criterion in this particular setting.

5.3. Global Similarity Scores

We have therefore devised the following evaluation method to test how well the scores for the similarity methods can differentiate between constitutive articles of theses and other articles. First, to establish comparability of similarity across methods, the scores for each method are z-transformed to obtain scores with mean 0 and standard deviation 1. Score values are thus expressed as differences from the overall mean in terms of standard deviations. Second, for theses in the sample for which true positives (associated articles) exist, we compute the average of the similarity scores of the true positive cases by thesis. Third, for theses without associated articles, we simply select the highest standardized similarity score value. Fourth, we calculate the averages of the scores for the two groups: theses with articles and theses without articles. A good similarity method should have the average similarity value for theses with articles appreciably greater than the average for theses without articles. The results are presented in Table 4, which shows that the methods LSA, fastText, and trigram cannot achieve standardized scores for true positives greater than those for the most similar articles of theses without associated articles. The two VSM variants and the RI methods exhibit much better performance. In particular, the L1 distance VSM variant shows more than 9 SD differentiation on average, while the difference for the VSM cosine method is about 1 SD, and those for RI and RRI are 0.3 and 0.1 SD, respectively. Note that the value for the L1 distance VSM for the average standardized similarity is negative because the distance values for similar items are smaller than average values, unlike for the other methods, where the similarities are greater for more similar items.

Table 4. Comparison of standardized similarity scores (z-scores)
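The four steps above can be sketched compactly; the data layout (a dict mapping thesis ids to candidate scores plus a true-positive mask, where an all-False mask marks a thesis without associated articles) is illustrative, not the study's actual structure.

```python
import numpy as np

def standardized_group_difference(theses):
    """Gap between the average z-score of true positives (theses with
    articles) and the average best z-score of theses without articles."""
    # Step 1: z-transform all scores jointly for cross-method comparability.
    all_scores = np.concatenate([s for s, _ in theses.values()])
    mu, sd = all_scores.mean(), all_scores.std()
    with_articles, without_articles = [], []
    for scores, mask in theses.values():
        z = (np.asarray(scores, dtype=float) - mu) / sd
        mask = np.asarray(mask, dtype=bool)
        if mask.any():
            # Step 2: mean z-score over this thesis's true positives.
            with_articles.append(z[mask].mean())
        else:
            # Step 3: best-scoring candidate for a thesis without articles.
            without_articles.append(z.max())
    # Step 4: difference of the two group averages; larger is better.
    return float(np.mean(with_articles) - np.mean(without_articles))
```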

5.4. Local Similarity Ranks

Another issue is that the density of neighbors in similarity vector space is probably not uniform. If there are more and less dense regions, then the global similarity scores are less informative than the local scores, that is, the scores of similarities of candidates for a single thesis. Consequently, it seems more prudent to look at similarity ranks, stratified by thesis, rather than global similarity values. However, as there are no true positives for theses without constitutive articles, we cannot cover these cases by using this approach.

We proceed with the analysis of recall scores at different rank positions across the methods. Figure 2 shows the curves of the recall values for each considered similarity model for ranks 1 through 20; higher ranks are not interesting as any thesis only has a few integrated articles, if any. Again, these values only include observations of theses that do contain published material, not those that do not. RI shows the best performance here, followed by the baseline VSM methods. RI can achieve 0.8 recall at rank 20 on average, out of some 1,500 candidates per thesis.

Figure 2. Recall across similarity rank positions.
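Recall at a rank cutoff, stratified by thesis, can be sketched like this (names and the data layout are illustrative):

```python
import numpy as np

def recall_at_rank(theses, k=20):
    """Average per-thesis recall at rank k: rank candidates by descending
    similarity and count the share of true positives in the top k.
    Theses without true positives are excluded, as in the evaluation."""
    recalls = []
    for scores, mask in theses:
        scores = np.asarray(scores, dtype=float)
        mask = np.asarray(mask, dtype=bool)
        if not mask.any():
            continue  # recall is undefined without true positives
        order = np.argsort(-scores)  # descending similarity
        hits = mask[order][:k].sum()
        recalls.append(hits / mask.sum())
    return float(np.mean(recalls))
```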

To assess whether the methods are biased in cross-language similarity measurement, we split the validation data and compute recall by rank separately for item pairs in the same language and for pairs in different languages. Figure 3 displays the results. We find that the VSM methods, LSA, and fastText perform much worse in recall when the items are in different languages. Trigram is somewhat less affected, RI is modestly affected, and RRI only slightly affected, exhibiting the least cross-language bias.

Figure 3. Recall across similarity rank positions by language concordance.
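The language-concordance split can be sketched as follows; the tuple layout (thesis language, article language, rank of the true positive) is an illustrative simplification.

```python
import numpy as np

def recall_at_k_by_concordance(pairs, k=20):
    """Recall@k computed separately for same-language and cross-language
    true-positive pairs, where each pair carries its similarity rank."""
    same = [r for tl, al, r in pairs if tl == al]
    cross = [r for tl, al, r in pairs if tl != al]
    recall = lambda ranks: float(np.mean([r <= k for r in ranks])) if ranks else float("nan")
    return recall(same), recall(cross)
```

A large gap between the two returned values indicates cross-language bias of the similarity method.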

Finally, Table 5 shows the number of missing similarity values per method. There were three cases in which no method could calculate a score, as the processing of the Scopus texts left no terms to represent the documents. The supervised methods had additional missing values in cases when no terms in the processed text, either thesis or article, were present in the training data. However, the number of missing values is very small compared to the overall number of similarity calculations for all methods.

Table 5. Missing similarity values

In summary, these results indicate that, somewhat unexpectedly, the baseline VSM similarity methods perform quite adequately, particularly when considering global similarity values. However, the best performing method when evaluating recall at low ranks is RI, whereas RRI performs a little worse. The pseudo-multilingual baseline method LSA shows only moderate performance, but clearly works to some extent, as can be seen from its results far exceeding random scores. FastText and trigram exhibit intermediate performance.

Before we proceed to the discussion of the results, a few limitations of this study need to be acknowledged. Because of the large number of choices that can be made in any information retrieval study, it is practically impossible to comprehensively cover all reasonable combinations of methods, settings, parameters, preprocessing steps, and so on. There are many different suggested methods for similarity calculation on vectors, term weighting, vector training, vocabulary pruning, and stop word removal. Parameters for supervised methods such as word context window size or parallel text alignment level could be varied. We have not tested more sophisticated methods for the composition of document-level representations from terms or of preprocessing steps such as decompounding, lemmatization, or word sense disambiguation. All of these factors could influence the results. To keep the scope of the study within feasible limits, we have chosen to apply only basic preprocessing and standard weighting and similarity methods. In the choice of evaluated methods, we have decided in favor of one representative state-of-the-art multilingual word vector method (fastText) and the conceptually attractive but little investigated trigram and RI methods.

ECRs, in particular doctoral students, publish many research outputs. Reliable quantitative estimates of their contribution to the total output of a country have hitherto been elusive, as has the assessment of the scientific impact of their research. Because all graduated PhDs have published a doctoral thesis, we have taken PhD thesis data as the starting point of our approach to quantify doctoral students’ research contributions. Cumulative doctoral theses consist of already published material; therefore it is crucial to identify their associated articles to quantify the citation impact of the doctoral research project as a whole. Moreover, the share of identified associated articles among all of a country’s articles can serve as a lower bound of the scientific contribution of doctoral students in terms of published output. Our prospective system for the identification of PhD thesis articles consists of a candidate article prefiltering stage and a subsequent automatic classification of candidate article records into those that are constitutive articles and those that are not. This second stage is anticipated to be accomplished by supervised machine learning algorithms trained and evaluated on sample data. For this matching of candidate associated articles to doctoral thesis records, not only the author names, authors’ institutional affiliations, and publication dates of candidate matches are important criteria but also the topical similarity of the research outputs. A good measurement of topic similarity can prove crucial in overcoming uncertainties in matching due to name ambiguities.

The text similarity calculation in this setting is demanding because of the brevity of the texts, the use of multiple languages, and the specialized scientific vocabulary. This rules out the unvalidated use of off-the-shelf solutions. No prior work in this setting has come to our attention, so this is a novel task. Following up on the call by Glavaš, Litschko et al. (2019), the present study is also an instance of a "downstream evaluation" of cross-language distributional semantics models. To this end, we tested three baseline and two state-of-the-art short text similarity methods on a custom validation data set. We collected the necessary training and evaluation data sets and tested the five methods' performance using evaluation measures adapted to the particularities of the data. While this study used German and English language text data, the findings can be informative for any other combination of two or perhaps more languages. Texts were preprocessed for all methods (except fastText) with language-specific stemming, and in all similarity calculations (except trigram) tf-idf term weights were used.
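As a concrete reference point, the tf-idf-weighted cosine baseline just described can be sketched in a few lines; the tokenization and the omission of stemming are simplifications of the study's pipeline, and all names are illustrative.

```python
import math
from collections import Counter

def tfidf_cosine(doc_a, doc_b, corpus):
    """Cosine similarity of two token lists under tf-idf weights estimated
    from `corpus`, a list of token lists used for document frequencies."""
    n = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))

    def vec(doc):
        tf = Counter(doc)
        # tf * log(N / df); terms unseen in the corpus are dropped
        return {t: tf[t] * math.log(n / df[t]) for t in tf if t in df}

    va, vb = vec(doc_a), vec(doc_b)
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    na = math.sqrt(sum(w * w for w in va.values()))
    nb = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

With no shared vocabulary between the two documents the score is exactly 0, which is the concentration-at-zero behavior discussed for the VSM methods above.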

Our results show that the long-established vector space model of text similarity measurement exhibits quite good performance for this task, likely benefiting from the fact that for one of the texts to be compared (Scopus article records) there will always be some English text and from the specialized scientific terminology. Once we look at the ranking results on the level of matching to individual theses, the limitations of the VSM become apparent as the RI method performs clearly better. The multilingual application of RI has so far only received limited attention ( Fernández, Esuli, & Sebastiani, 2016 ; Moen & Marsi, 2013 ; Sahlgren & Karlgren, 2005 ) but the present results are very encouraging. The findings also indicate that the trigram and fastText methods perform moderately well, while LSA is not competitive for this particular task. All methods suffer from some bias when the languages of the compared items differ, but to very different degrees, with RRI being almost unaffected. In conclusion, for the anticipated task of using text similarity as one of a set of features for identifying cumulative theses’ associated articles, a combination of VSM cosine similarity score and RI rank can be recommended, with the proviso that the VSM method is by its nature biased in favor of same-language texts. In addition, we can recommend the use of document records of cumulative doctoral theses and their constitutive articles as benchmark data sets for cross-language short text similarity tasks.

The author would like to thank the Information Management group of Deutsche Forschungsgemeinschaft for providing the bilingual project description data of funded DFG projects and Beatrice Schulz for her help in data collection. This research has made use of the PostgreSQL database system and the contributed extensions aggs_for_vecs and floatvec .

Funding was provided by the German Federal Ministry of Education and Research (grant numbers 01PQ16004 and 01PQ17001).

The author has no competing interests.

Data is made available at https://doi.org/10.5281/zenodo.4733850 and https://doi.org/10.5281/zenodo.4467633 except for proprietary data from Elsevier Scopus. Programming code is made available at https://gitlab.com/pdonner/ri_sql .

Germany does not maintain a central register of active PhD students or graduates ( Fräßdorf & Fräßdorf, 2016 ) and universities have only been required to systematically and comprehensively collect data on current doctoral students since 2017 ( Brauer, Oelsner, & Boelter, 2019 ). These new data are decentralized, not public, and only cover the period since 2017.

To give an extreme example, between 1996 and 2016, according to data collected from the German National Library, which was deduplicated and excludes medical theses, there are 48 persons named Thomas Müller who have authored a doctoral thesis. Of these, two different ones graduated from Heidelberg University in 1999.

Another approach would be to start with the available full-text electronic documents, apply automatic reference identification procedures to extract the cited sources and use these to identify associated articles by the thesis authors. This seems a promising alternative, albeit with the one drawback that some articles that have not been published at the time of the handing in of the dissertation are typically only cited in a provisional way. The larger problem is external to the data itself. As it stands, by no means all dissertations are published as publicly accessible electronic full-text versions. A reference extraction and matching approach would hence either be limited in coverage or need to collect and prepare theses published by publishing houses in book format or deposit copies from libraries. The effort required for this alternative approach was prohibitive in our project, so we decided to work with bibliographic data only.

Or word embeddings, or wordspace, or continuous word vectors, etc. The terminology has not yet stabilized.

Contrary to the standard RI method ( Sahlgren, 2005 ), the authors started out with assigning fixed index vectors to terms, using the same vector for the terms in both languages, rather than starting with index vectors for documents and constructing term index vectors from the document vectors.

For a very limited subset of recent cumulative theses, the DNB data contains information on included articles if these previously published works are completely incorporated in unchanged form. If the full-texts of the theses and the candidate articles are available, the true positive articles should be almost strict subsegments of the thesis they are part of. This suggests that pairs of cumulative dissertation and constituent articles could be ideal true positive gold standards for plagiarism detection methods.

German “Dr. med.” degree dissertations are considered incommensurable to other doctoral degree theses ( Senatskommission für Klinische Forschung, Deutsche Forschungsgemeinschaft, 2010 ; Wissenschaftsrat, 2014 ).

The data set is available at Donner (2021b) .

https://gepris.dfg.de/

https://p3.snf.ch/

https://cordis.europa.eu/

https://dict.tu-chemnitz.de/ , https://ftp.tu-chemnitz.de/pub/Local/urz/ding/de-en/

We also experimented with more sophisticated natural language processing by part-of-speech tagging, lemmatization, and extraction of noun phrases. This proved to be too computationally expensive for application to the entire corpus. The question of whether such higher-quality preprocessing can significantly improve results remains an open issue for further research.

https://www.postgresql.org/docs/12/pgtrgm.html

https://fasttext.cc/docs/en/aligned-vectors.html

We make our trained LSA, RI, and RRI models of dimensionality 1000 available in Donner (2021a) .


Online ISSN 2641-3337


Thesis / Dissertation Formatting Manual (2024)


Securing Your Copyright

As the author of your thesis or dissertation, only you are legally entitled to authorize publication or reproduction of your intellectual property, although you may assign your rights to others.  Copyright is secured automatically when a work is created,  which is when it is fixed in a tangible form for the first time. Under present U.S. copyright law, the term of the copyright is the author's life plus 70 years.

Registering Your Copyright

Registering your copyright is optional, as your work is automatically copyrighted when it is created. If you wish to further protect your rights in a copyright dispute and to be eligible for damages caused by infringement, you may choose to register your copyright. You are eligible to register your copyright at any time within the copyright term (the author's life plus 70 years). ProQuest provides an optional copyright registration service for a fee ($75 in 2022). If you pay for this service, ProQuest will register your copyright and submit your manuscript to the Library of Congress.

Creative Commons

U.S. copyright is a collection of rights governing how a work fixed in a tangible medium of expression can be used. By applying a Creative Commons badge you let readers know which features of the document can be used, reused, or cited with attribution, and whether there are limits. Application of the Creative Commons badge through ProQuest is optional.

  • The CC-BY license allows for use/reuse with attribution.
  • Addition of NC states you are requesting no commercial reselling of the work without permission of the author (you).
  • The ND badge means no derivative works, such as a graphic version or a language translation, without formal permission. For example, if a print version is the original version, then an electronic version would need the author's permission to create and distribute. Journal articles may use CC-BY-NC-ND to allow the original PDF to be viewed but not commercially resold or used to create derived works.


Intensification of evaporation of uranium hexafluoride

  • Chemical Engineering Science and Chemical Cybernetics
  • Published: 14 August 2013
  • Volume 47 , pages 499–504, ( 2013 )

A. M. Belyntsev, G. S. Sergeev, O. B. Gromov, A. A. Bychkov, A. V. Ivanov, S. I. Kamordin, P. I. Mikheev, V. I. Nikonov, I. V. Petrov, V. A. Seredenko, S. P. Starovoitov, S. A. Fomin, V. G. Frolov & V. F. Kholin


The theoretical mechanism of the sublimation of uranium hexafluoride is considered. The largest contribution to the rate of evaporation of UF6 comes from the conductive mode of heat exchange. Various modes of intensifying the evaporation of uranium hexafluoride by supplying nitrogen in pulse mode to the product mass are investigated. The nitrogen supply turbulizes the gas flow within the vessel (Re = 2500–4000) and significantly increases the rate of evaporation of uranium hexafluoride, with a substantial decrease in the weight of the nonevaporable residue from 5.6 to 1.0 kg. The combined application of the pulsed nitrogen supply and heating of the bottom of the vessel is the most effective method for evaporating uranium hexafluoride: The rate of evaporation of UF6 increases by a factor of almost four in comparison with the design mode. The developed methods are applied in industry and provide stable operation of Saturn reactors during the conversion of uranium hexafluoride into its dioxide.


Similar content being viewed by others


Production of Uranium Hexafluoride with Low 234U Content in a Cascade with Intermediate Product

V. A. Palkin

Plasma-Chemical Treatment of Process Gases with Low-Concentration Fluorine-Containing Components

H. S. Park, S. P. Vaschenko, … D. Yu. Batomunkuev

Obtaining Hydrogen Fluoride During the Interaction of Uranium Hexafluioride with Hydrogen and Oxygen in a Combustion Regime. Experiment

D. S. Pashkevich, Yu. I. Alekseev, … V. V. Kapustin


Author information

Authors and affiliations.

Engineering Works, Elektrostal’, Moscow oblast, Russia

A. M. Belyntsev, A. A. Bychkov, I. V. Petrov, S. P. Starovoitov, S. A. Fomin & V. G. Frolov

Leading Research Institute of Chemical Technology, Moscow, Russia

G. S. Sergeev, O. B. Gromov, A. V. Ivanov, V. I. Nikonov, V. A. Seredenko & V. F. Kholin

Bochvar All-Russia Research Institute of Inorganic Materials, Moscow, Russia

S. I. Kamordin

Bauman Moscow State Technical University, Moscow, Russia

P. I. Mikheev

Corresponding author

Correspondence to A. M. Belyntsev .

Additional information

Original Russian Text © A.M. Belyntsev, G.S. Sergeev, O.B. Gromov, A.A. Bychkov, A.V. Ivanov, S.I. Kamordin, P.I. Mikheev, V.I. Nikonov, I.V. Petrov, V.A. Seredenko, S.P. Starovoitov, S.A. Fomin, V.G. Frolov, V.F. Kholin, 2011, published in Khimicheskaya Tekhnologiya, 2011, Vol. 12, No. 11, pp. 675–681.


Belyntsev, A.M., Sergeev, G.S., Gromov, O.B. et al. Intensification of evaporation of uranium hexafluoride. Theor Found Chem Eng 47 , 499–504 (2013). https://doi.org/10.1134/S0040579513040040

Received : 25 January 2011

Published : 14 August 2013

Issue Date : July 2013


  • uranium hexafluoride
  • sublimation
  • turbulization of gas flow
  • rate of evaporation of UF 6
  • conversion UF 6 within N 2

19th Edition of Global Conference on Catalysis, Chemical Engineering & Technology

Victor Mukhin


Title : Active carbons as nanoporous materials for solving of environmental problems

However, up to now, the main carriers of catalytic additives have been mineral sorbents: silica gels and alumogels. This is obviously because they consist of pure homogeneous components, SiO2 and Al2O3, respectively. It is generally known that impurities, especially the ash elements, are catalytic poisons that reduce the effectiveness of the catalyst. Therefore, carbon sorbents with 5–15% by weight of ash elements in their composition are not used in the above-mentioned technologies. However, in such an important field as gas-mask technology, carbon sorbents (active carbons) are carriers of catalytic additives, providing effective protection of a person against all types of potent poisonous substances (PPS). At ESPE JSC "Neorganika" a technology has been developed for unique ashless spherical carbon carrier-catalysts, brand PAC, by the method of liquid forming of furfural copolymers with subsequent gas-vapor activation. Active carbons PAC have 100% qualitative characteristics of the three main properties of carbon sorbents: strength, 100%; proportion of sorbing pores in the pore space, 100%; purity, 100% (ash content close to zero). A particularly outstanding feature of active PAC carbons is their uniquely high mechanical compressive strength of 740 ± 40 MPa, which is 3–7 times larger than that of such materials as granite, quartzite, and electric coal, and is comparable to the value for cast iron (400–1000 MPa). This allows the PAC to operate under severe conditions in moving and fluidized beds. Obviously, it is time to actively develop catalysts based on PAC sorbents for oil refining, petrochemicals, gas processing, and various technologies of organic synthesis.

Victor M. Mukhin was born in 1946 in the town of Orsk, Russia. In 1970 he graduated from the Technological Institute in Leningrad. He was then assigned to the scientific-industrial organization "Neorganika" (Elektrostal, Moscow region), where he has worked for 47 years, at present as head of the laboratory of carbon sorbents. He defended his Ph.D. thesis and doctoral thesis at the Mendeleev University of Chemical Technology of Russia (in 1979 and 1997, respectively) and is a professor at that university. Scientific interests: production, investigation, and application of active carbons; technological and ecological carbon-adsorptive processes; environmental protection; and production of ecologically clean food.


COMMENTS

  1. Thesis by publication (cumulative dissertation)

    If your dissertation contains complete published or submitted papers, there are several copyright issues to consider: 1. Checking the legal conditions. 2. Phrase embedding. 3. Agreement of the co-authors.

  2. Copyrighting

    When you submit your thesis or dissertation, ProQuest charges a fee for its copyright registration service ($55, subject to change). The service includes preparing an application in your name, submitting your application fee, depositing the required copy or copies of the manuscript, and mailing you the completed certificate of registration from the Library of Congress.

  3. Copyright for Graduate Students: Theses and Dissertations

    Except where otherwise indicated, original content in this guide is licensed under a Creative Commons Attribution (CC BY) 4.0 license. You are free to share, adopt, or adapt the materials. We encourage broad adoption of these materials for teaching and other professional development purposes, and invite you to customize them for your own needs.

  4. Copyright Complications

    Theses and dissertations which contain embedded PJAs (published journal articles) as part of the formal submission can be posted publicly by the awarding institution, with DOI links back to the formal publications on ScienceDirect. From the publisher's FAQ: Can I include/use my article in my thesis/dissertation? Yes.

  5. Dissertation Copyright

    122 College Hall University of Pennsylvania Philadelphia, PA 19104 215.898.5000

  6. PDF Cumulative versus monographic Dissertation

    Requirements of the PhD regulations (2008) for cumulative dissertations: The regulation does not fix a certain number of publications. It just says "§ 6 (4) With the consent of the supervisor and the chair of the doctoral committee, the dissertation may be submitted as a cumulative work. In this case, several ..."

  7. PDF Guidelines for Cumulative Dissertations

    The dissertation must contain a concluding discussion that refers to all chapters. This discussion should explain how the chapters contribute to answering the research question(s) of the dissertation as stated in the introduction. In addition, the overall methodology should be discussed.

  8. Understanding Copyright

    When in doubt, consult Carnegie Mellon's Center for Technology Transfer and Enterprise Creation (CTTEC), 412-268-7393 or [email protected]. Neither the University Libraries nor ProQuest/UMI require copyright transfer to publish your dissertation. Both require only the non-exclusive right to reproduce and distribute your work.

  9. Writing a cumulative dissertation

    Writing a cumulative dissertation. 2019-10-21. A cumulative dissertation is a collection of articles which have been published in recognised scientific journals or accepted for publication. My PhD dissertation is a cumulative one and in this blog post I describe its structure and things to pay attention to when writing your own.

  10. PDF Kumulative Dissertation

    Supplement/change in journal series for a cumulative dissertation dated 08.11.2023: The list of recognized journals (paragraph 3) of the formal minimum requirements for a cumulative dissertation of 31.01.2008 (FBR resolution of 6.2.2008) is supplemented with the journal "Environmental Data Science" of Cambridge University Press.

  11. PDF Guidelines for Cumulative Dissertations

    Guidelines for Cumulative Dissertations. A cumulative dissertation at the School of Business, Economics and Social Sciences consists of: An introduction. At least three scientific papers that are either published or at a level suitable for publication in academic journals. Each paper should make a substantial original contribution.

  12. thesis

    Cumulative dissertation is probably a literal translation of the German Kumulative Dissertation, which denotes a thesis by publication, compilation thesis, or article thesis, i.e., a thesis which typically consists of some peer-reviewed publications, an introduction, and a conclusion. The alternative to this is a monograph thesis, which is written separately as a coherent monolithic work and ...

  13. Collection of articles

    A thesis as a collection of articles [1] or series of papers, [2] also known as a thesis by published works, [1] or article thesis, [3] is a doctoral dissertation that, as opposed to a coherent monograph, is a collection of research papers with an introductory section consisting of summary chapters. Other, less common terms are "sandwich thesis" and ...

  14. How to structure a cumulative dissertation: Five strategies

    A cumulative dissertation consists of a series of papers published, or submitted for publication during the timeframe of the doctorate study. In addition to the papers, the PhD student is required to create an overarching argument that is to be presented in the thesis's introductory and conclusion chapters. The number of papers to be ...

  15. Identifying constitutive articles of cumulative dissertation theses by

    Abstract. Cumulative dissertations are doctoral theses comprised of multiple published articles. For studies of publication activity and citation impact of early career researchers, it is important to identify these articles and link them to their associated theses. Using a new benchmark data set, this paper reports on experiments of measuring the bilingual textual similarity between, on the ...

  16. PDF Microsoft Word

    Requirements for cumulative dissertations: A cumulative dissertation consists of four scientific articles/academic papers and one synopsis. Three of the scientific articles/academic papers must be published in internationally renowned peer-reviewed academic journals or in equivalent publications that reflect the standards of the discipline.

  17. Copyright and Creative Commons

    If you wish to further protect your rights in a copyright dispute and to be eligible for damages caused by infringement, you may choose to register your copyright. You are eligible to register your copyright at any time within the term (author's life plus 70 years). ProQuest provides an optional copyright registration service for a fee ($75 in ...

  18. PDF (Kumulative) Dissertation

    (3) The dissertation may also be based on prior publications or on works submitted for publication ("cumulative, publication-based dissertation"). It must contribute an advance in knowledge equivalent to that of a monographic dissertation and meet the other requirements under paragraph 1.

  19. Kumulative Dissertation

    Kumulative Dissertation: https://business-and-science.de/kumulative-dissertation/ — In this video you will learn what a publication dissertation (also Sammeld...

  20. Intensification of evaporation of uranium hexafluoride

    Gromov, B.V., Vvedenie v khimicheskuyu tekhnologiyu urana (Introduction to Uranium Chemical Technology), Moscow: Atomizdat, 1978. Google Scholar . Sergeev G.S. Study of the evaporation of uranuym hexafluoride from solid and liquid phases and ways of intensifying this process, Cand. Sci. (Eng.) Dissertation, Moscow: All-Union Research Inst. of Chemical Technology, 1970.

  21. Victor Mukhin

    Catalysis Conference is a networking event covering all topics in catalysis, chemistry, chemical engineering and technology during October 19-21, 2017 in Las Vegas, USA. Well noted as well attended meeting among all other annual catalysis conferences 2018, chemical engineering conferences 2018 and chemistry webinars.

  22. Active carbons as nanoporous materials for solving of environmental

    Catalysis Conference is a networking event covering all topics in catalysis, chemistry, chemical engineering and technology during October 19-21, 2017 in Las Vegas, USA. Well noted as well attended meeting among all other annual catalysis conferences 2018, chemical engineering conferences 2018 and chemistry webinars.

  23. Intermittency and concentration probability density function in

    PDF | On Sep 1, 1986, Vladimir Sabelnikov published Intermittency and concentration probability density function in turbulent flows, Thesis Doctor en Science, Moscow Institute of Physics and ...