It's really a pity that they do this now. Some of their older papers had actually quite some valuable information, comments, discussions, thoughts, even commented out sections, figures, tables in it. It gave a much better view of how the paper was written over time, or even how the work progressed over time. Sometimes you also see some alternative titles being discussed, which can be quite funny.
> Some of their older papers had actually quite some valuable information, comments, discussions, thoughts, even commented out sections, figures, tables in it.
I think you answered your own question.
Chinjut 10 hours ago [-]
What question?
JohnKemeny 9 hours ago [-]
I think I read the comment as being sceptical as to why. I withdraw my comment in that form.
toolslive 15 hours ago [-]
Maybe papers need to be put under version control.
westurner 6 hours ago [-]
FigShare and Zenodo grant (DataCite) DOIs for git commit tags.
Maybe papers need to contain executable test assertions.
enriquto 17 hours ago [-]
> Removes all comments from your code (yes, those are visible on arXiv and you do not want them to be).
Why not? I love to peek at .tex file comments, and secretly hope that somebody somewhere is reading mine...
pantalaimon 14 hours ago [-]
Those comments might also explain how some cool figure was done
lou1306 16 hours ago [-]
Ehh, sometimes you have additional results or insightful remarks that simply don't fit into the page limits. You may want to keep those for yourself and use them for a separate publication rather than give them away.
JohnKemeny 15 hours ago [-]
Well, you don't have page limits on arXiv, though.
michaelmior 14 hours ago [-]
This is true, but arXiv submissions are often prepared with a target venue in mind that does have page limits.
JohnKemeny 11 hours ago [-]
Also true, but the arXiv version often (in my experience) contains the entire paper. Indeed, many conferences ask people to submit the full version to arXiv.
lou1306 7 hours ago [-]
Aye, but in this context "full version" usually means "a version with more detailed proofs/results related to the paper's contributions", rather than "a version with additional contributions".
michaelmior 10 hours ago [-]
Interesting. I know it frequently happens, but I've never seen a conference explicitly make that request.
JohnKemeny 9 hours ago [-]
Here's an example (ICALP):
> Authors are strongly encouraged to also make full versions of their submissions freely accessible in an on-line repository such as ArXiv, HAL, ECCC.
This is from the call for papers a few years ago. The wording has changed in recent CFPs, due to employing (weak) double-blind reviewing.
They still allow uploading to arXiv (with full names and affiliations) despite being anonymous.
jpfr 17 hours ago [-]
Many researchers learn LaTeX by looking at the idioms used for the papers they really like.
That includes code for Tikz figures.
I hope people will use this tool only to remove the inadvertent disclosure of commented regions and to reduce the file size. But keep the LaTeX source intact otherwise!
WanderPanda 9 hours ago [-]
It needs to be intact, the pdf is rendered by the arxiv backend based on the source
semi-extrinsic 9 hours ago [-]
You can upload only the PDF on ArXiv. Useful when you for some reason (e.g. client request) publish in certain engineering conferences that only allow Word submissions...
generationP 19 hours ago [-]
What is the point of concealing tikz source code? It increases the size of the source archive and undermines accessibility.
DominikPeters 13 hours ago [-]
It's also nice for other people to reuse and adapt your figure, or include it in beamer presentations.
MengerSponge 19 hours ago [-]
And obfuscating "raw simulation data"? It's not pro-research fraud, but it's what a person who was pro-research fraud would prefer.
mattkrause 7 hours ago [-]
Agreed that the phrasing is suspicious!
However, it’s pointless or even counterproductive to embed the raw high-resolution data in the paper because it doesn’t show up in the rendered copy but balloons its size. For a 6.5” (i.e., full-width) figure printed at 300 dpi, you can only show about 1950 points horizontally, and realistically a lot fewer. Upload the raw traces somewhere and add a link.
Source: As a grad student, I stupidly turned a simple poster into a multi-gigabyte monstrosity by embedding lots of raw data. The guy at the print shop was not happy when it crashed his large-format printer!
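To make the arithmetic concrete, here is a rough Python sketch of that point budget and of a naive way to thin a trace before plotting it. The function names (`max_resolvable_points`, `decimate`) are made up for illustration, and plain stride-based decimation is only one assumption about how you might downsample; it can hide narrow peaks.

```python
import math


def max_resolvable_points(width_in: float, dpi: int) -> int:
    """Number of distinct horizontal positions a printed figure can show."""
    return math.floor(width_in * dpi)


def decimate(samples: list[float], max_points: int) -> list[float]:
    """Keep every k-th sample so that at most max_points samples remain.

    Hypothetical helper: simple striding, no averaging or peak preservation.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    step = max(1, math.ceil(len(samples) / max_points))
    return samples[::step]


budget = max_resolvable_points(6.5, 300)  # 6.5 in * 300 dpi = 1950
trace = [float(i) for i in range(10_000)]
thinned = decimate(trace, budget)
```

Anything beyond that budget only adds file size, so the full-resolution trace is better off in a linked data repository.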
DominikPeters 13 hours ago [-]
To remove comments, one can also run, for example `latexpand --empty-comments --keep-includes --expand-bbl document.bbl document.tex > document-arxiv-v1.tex`. Latexpand should come pre-installed with texlive. Without the `--keep-includes` option, it also flattens the tex files into one.
But I'd consider removing comments by hand and leaving any comments that are potentially insightful.
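For anyone who wants to see roughly what such a comment pass involves, here is a minimal Python sketch. It is not how latexpand is implemented; `strip_comments` is a hypothetical helper, and it assumes the source contains no verbatim-like environments or `\url{...}` arguments with a literal `%`, which it would wrongly truncate.

```python
import re

# A '%' starts a comment only if it is preceded by an even number of
# backslashes: '\%' is a literal percent sign, while '\\%' is a line break
# followed by a comment.
_COMMENT = re.compile(r"(?<!\\)((?:\\\\)*)%.*")


def strip_comments(tex: str) -> str:
    """Drop whole-line comments and empty out trailing ones.

    The trailing '%' itself is kept, because in TeX it also suppresses the
    end-of-line space and removing it could change the typeset output.
    """
    kept = []
    for line in tex.splitlines():
        if line.lstrip().startswith("%"):
            continue  # the entire line is a comment
        kept.append(_COMMENT.sub(r"\1%", line))
    return "\n".join(kept)


source = "Accuracy is 50\\% here % TODO: recheck\n% old title idea\n\\title{X}"
cleaned = strip_comments(source)
```

Running it over a copy of the file and diffing against the original is an easy way to decide which comments are worth putting back by hand.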
sabjut 13 hours ago [-]
I wish journals would start accepting Typst[0] files. It is definitely the format of the next decade in my opinion. It's both open source and highly performant.
Sadly existing legacy structures prevent it from gaining the critical mass needed for it to thrive just yet.
Or, don't put your stuff on the arXiv, but put it on zenodo. You also get a DOI, and you can just publish the PDF, not the source. You can even restrict access to the PDF, and create share links with access to it.
evanb 12 hours ago [-]
You get a DOI on the arXiv. You can just publish the PDF on the arXiv, but this is a sure sign you are a crackpot.
auggierose 12 hours ago [-]
You cannot just publish the PDF, they have checks that make sure that you didn't produce your PDF with LaTeX. There are probably ways to get around that, but why? Just use zenodo instead.
Or just publish on zenodo, without all that fuss. The reasons the ArXiv gives may be good from their point of view, but if you don’t care too much about that but have your own good reasons for not wanting to publish your source, then zenodo is a great and in many respects superior alternative, no questions asked.
auggierose 12 hours ago [-]
You mean, like Grisha Perelman?
evanb 10 hours ago [-]
There are exceptions to every rule.
auggierose 10 hours ago [-]
If you are “sure” I expect 100% correctness.
evanb 9 hours ago [-]
See, every rule has an exception.
auggierose 5 hours ago [-]
Let's assume that every rule has an exception. Then this rule must have an exception as well, so there is a rule with no exception. That is a contradiction.
So most definitely, there are some rules with no exception. The ones you are sure about should be among them.
frumiousirc 12 hours ago [-]
arXiv issues DOIs for submissions.
auggierose 12 hours ago [-]
I didn't say otherwise. In fact, the "also" is meant to express exactly that.
E.g. from https://arxiv.org/abs/1804.09849:
%\title{Sequence-to-Sequence Tricks and Hybrids\\for Improved Neural Machine Translation}
% \title{Mixing and Matching Sequence-to-Sequence Modeling Techniques\\for Improved Neural Machine Translation}
% \title{Analyzing and Optimizing Sequence-to-Sequence Modeling Techniques\\for Improved Neural Machine Translation}
% \title{Frankenmodels for Improved Neural Machine Translation}
% \title{Optimized Architectures and Training Strategies\\for Improved Neural Machine Translation}
% \title{Hybrid Vigor: Combining Traits from Different Architectures Improves Neural Machine Translation}
\title{The Best of Both Worlds: \\Combining Recent Advances in Neural Machine Translation\\ ~}
Also a lot of things in the Attention is all you need paper: https://arxiv.org/abs/1706.03762v1
[0] https://typst.app/
If you disagree with their good reasons for requiring the TeX source (https://info.arxiv.org/help/faq/whytex.html), you might be granted an exception.
...and here is the tool to quickly inspect comments that were left in the LaTeX