The place to ask your Arch questions

Some people prefer to ask questions on mailing lists, some on IRC, and some on the wiki.

Please make some effort to search for an answer first, however. There is a 99.9% chance that your question has already been answered, either on this Wiki (Arch Recipes is a good place to look) or on the gnu-arch-users mailing list (see the archives). (I think those links to the mailing list don't work?)

Equivalent of cvs import

Question: Learning Arch commands for CVS users seems to say that import of a tree of 1000 files will require 1000 separate tla add commands, or in other words: there is no simple equivalent of cvs import. Is that really true? If so, some more explanation of that point might be worthwhile.

Answer: You are incorrect about 1000 separate commands -- multiple files can be added in a single command. However, you are correct that there is no easy way to execute the large number of commands that would be needed.

To automatically import an existing tree, you will need something like the following:

 $ tla inventory -s -B --names | xargs -n 1 tla add

Note: This command will not work on files with spaces and other characters tla dislikes. Adding --unescaped to the inventory command is a start, but then you need to stop xargs from separating the strings that have whitespace in them. Anyone have a real fix?

tla should acquire a -0 option as found in other utilities often used with xargs.

Workaround for filenames with spaces, but without a -0 tla option (requires a Bourne shell):

 $ tla inventory -s -B --names --unescaped | while IFS= read -r fname; do tla add "$fname"; done

Alternative workaround:

 $ tla inventory -s -B --names --unescaped | tr '\n' '\000' | xargs -0 -I filename tla add filename
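
The workarounds above can also be wrapped in a small shell function. This is only a sketch: add_each is a hypothetical helper name, not part of tla, and it assumes the inventory prints one file name per line (so names containing newlines are still not handled). Using read -r with an empty IFS keeps backslashes and leading or trailing spaces intact, and redirecting the command's standard input stops it from swallowing the rest of the list.

```shell
# add_each CMD [ARGS...]
# Hypothetical helper (not part of tla): runs CMD ARGS NAME once for
# every line read from standard input.
add_each() {
    while IFS= read -r fname; do
        # </dev/null keeps the command from consuming the name list.
        "$@" "$fname" </dev/null
    done
}

# Intended use, assuming tla is installed:
#   tla inventory -s -B --names --unescaped | add_each tla add
```

Passing a harmless command such as printf '%s\n' instead of tla add gives a dry run that shows exactly which names would be added.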

Storing binary files

Question: I couldn't find any information on how Arch handles binary files, or how much space it uses in general. How does it compare to CVS in this regard? If I have a terabyte of versioned source under CVS (including some binaries), would it use a similar amount of space if migrated to Arch? What would Arch's performance be like in this case? This isn't covered in anything I've seen in my one hour investigation of Arch.

Answer: Every revision in an archive is stored as a tar.gz file. Text files are stored as patches to the original, but any file that 'diff' reports as binary is instead stored twice: Once as its original, and once as its modified version. (Reportedly, diff considers any file containing NUL bytes to be binary.) Like a patch, this helps ensure integrity by ensuring that the right data is being replaced -- but with no sense of context like a patch.

As far as storage goes, I have just used cscvs to convert an old company project into arch to answer this question. It was a Win32 Delphi program that uses on-disk binary database files (Borland Paradox) to store data in multiple small files in the 'db' directory. (These typically held almost no data by default except their interrelationships and one 120k file, but due to relationships, they changed extremely often.)

(All figures include the output size of 'du --apparent' and plain 'du' as 'on disk'.)

The CVS repository was 15935k (17164k on disk). The complete arch repository (441 patches) was 4035k (8340k on disk). The final checked-out version in arch was 5520k (9312k on disk).

Getting rid of the database, the CVS repository was 7799k (8348k on disk). The arch repository (333 patches) was 4999k (also 8348k on disk). The checked-out copy was 3767k (5840k on disk).

Finally, for the database (binary), the CVS repository was 8137k (8816k on disk). The arch repository (128 patches) was 1265k (2488k on disk). And the checked-out copy was 1688k (3480k on disk).

The greatest space benefit for arch is likely that almost all data in the repository is compressed. The downside is that it creates many directories and files, which can (on the more common filesystems) squander some space.

If you begin reaching maximum disk capacity for your archive, arch offers the option of Cycling Archives. Open a new archive, tag the projects you want to keep available or continue working on, cacherev their base-0 tags (done by default with tla 1.2), and then remove the old archive onto high-capacity backup media.

What about Arch on Windows?

Question: Why hasn't Arch been recompiled under Cygwin? Sounds like it Should Just Work. Or would there be perpetual CR-LF <--> LF conversion problems? CVS provides a solution to that problem for free.

Answer: Cygwin tla 1.2 binaries are now available. Also, Cygwin is not the only solution any more, but rather, one of three possible usage styles. Please see Native WIN32 Support.

Spaces in file names?

Question: I've read somewhere that Arch doesn't allow spaces in source filenames and directories. Is that true? It's a showstopper for this potential user, if so.

Answer: No longer true. Support for spaces in source filenames was released in 1.2.1.

Equivalent of CVS keywords

Question: Is there a way to make arch automatically update some keywords in the source when updating/committing, like CVS does with $Log$ and $Id$, for example?

Answer: No. See this post on the mailing list for a fuller explanation.

Answer #2: Not now, but it could be implemented, for instance as a tla export command that exports only the non-tla files and replaces $Keywords$ while exporting. This could also simplify the How to make a distribution tarball recipe.

Answer #3: No. And you don't really want them either. I just worked on a 848k (about 20k lines) diff that was mostly spurious keyword expansion differences. One file even had two $Log$ expansions. The diff touched 1107 files, out of which only 13 were really modified in an interesting way.

Forward Vs Backward patches

Question: Why does arch store patches from the initial revision forward to the latest, instead of keeping the latest revision plus backward patches to the initial one? (I think this is how CVS does it.)

Answer: arch operates under the principle that one should never try to alter the past. If you are continually deleting your latest revision and replacing it with a new one, like CVS, it doesn't matter if you still provide a consistent view of the past -- you are still changing it. If a corruption occurs in the head version, you have lost both your present and your past (up to the last backup). If a corruption occurs in a previous version, you've lost everything before that -- but you won't ever know it until you try to retrieve it, by which time it may be so old that you don't have any backups left.

Speaking of which, forward-patching is also more bandwidth-conservative (for typical use) and backup-friendly. With the protocols arch uses to access archives, backwards-patching would require uploading a whole tree every single commit. And incremental backups only need to add small patches, rather than a whole version of the program at every backup.

Forward-patching loses to backwards-patching when it comes to efficiency. However, there are many ways to optimise it, and most of those have been implemented. Cacherevs and revision libraries are the two best ways right now, and give the choice between using cache space on the server or using cache space on the development platform.

Content of the {arch}/ directory

Question: What is inside the {arch}/ directory? Why is it so big? Is it reasonable to link it to /tmp? Is it possible to recover it easily in case I lose it?

Answer: The {arch} directory contains metadata about your source archive. Among other things, this includes pristine trees and the merge history, both discussed below.

Pristine trees are the most common reason for {arch} to get very large. Pristine trees are, as the name suggests, copies of previous revisions. They're kept around so that you can quickly generate diffs for merging, seeing what changed, etc. (It's very similar to svn-base in subversion.) Normally you will only have a pristine tree for one recent revision, but it's possible there will be more. You can find out by running tla pristines. See section Arch Storage in this wiki.

It's always safe (???) to delete unlocked pristines, just using rm. They'll be recreated when they're needed. However, this might require downloading those old versions from the archive. (So there is a space/time tradeoff.)

You can avoid keeping pristine trees under {arch} by using a "revision library". This allows pristine trees to be shared amongst all your working directories. If your working directory is on a slow medium (NFS), it is a good idea to put your revision library on a faster one. If your archive is easily reachable (cheap network connection or local disk), the revision library can be rebuilt easily at any time, so it is reasonable to put it on a medium without backups (possibly /tmp).

On some filesystems, it's big because the merge history is in the form of many small files in many directories. On Linux, the reiserfs filesystem handles small files efficiently.

It's not reasonable to link to /tmp. You don't want to lose it.

If you do lose it, you can recreate it with the init-tree command, but you will lose any uncommitted merge history. You could also check out a full tree of the last version with tla get, copy all your uncommitted changes to the checked out tree, and commit from there.

Equivalent of CVS release tags

Question: Hello, I'm trying to figure out the best way to reproduce 'tagging releases' in CVS. Tonight on IRC it was suggested that I use 'configs' for this. Would this be the way to do that?

# These steps only need to be done once for project
$ mkdir configs
$ echo './      mark@summersault.com--2004/cgi-uploader--main--0'>configs/uploader.arch
# Now to create a new release tag based on the current patch level
$ tla cat-config --snap --output configs/release-0.61_02 uploader.arch

From there, I would just have to make sure that my configs are preserved, and I suppose 'build-config' would be used to fetch the release in the future. I haven't gotten that far yet. :)

This doesn't seem to work; or at least, if it does, I haven't found out enough to make it work. When I do build-config I get "unable to set aside conflicting directory". How is this supposed to work? RichardKettlewell

Use of the "keywords: " field in patch-logs

Question: What's the use of the "keywords: " field in the log file? I didn't find any tla features using this field, so I suppose it is just a place where you can put a few words, without any particular function. Is there any convention about what kind of keywords we are supposed to use?

Answer: Not used currently, but Tom Lord has "big plans for it" http://mail.gnu.org/archive/html/gnu-arch-users/2004-05/msg00563.html

Which patch

Question: I'm trying to use arch on a Solaris box, but it's trying to use the native Solaris version of patch in /usr/bin instead of GNU patch in /usr/local/bin. Trying to tla commit fails unless I muck with the $PATH to include /usr/local/bin first. Is there some way to tell tla which patch to use?

Answer: When compiling tla, give the --with-gnu-patch option to the configure script.

Another option is to write a wrapper script setting a correct $PATH before calling tla.

A last possibility: add $HOME/bin at the beginning of your $PATH, and put a symbolic link to the right patch there.
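
The wrapper idea can be sketched as a shell function. This is a minimal illustration, not something shipped with tla: run_with_path_prefix is a hypothetical name, and /usr/local/bin is simply the location mentioned in the question. The PATH change happens in a subshell, so the calling shell keeps its own PATH.

```shell
# run_with_path_prefix DIR CMD [ARGS...]
# Hypothetical helper: runs CMD with DIR placed first in PATH, inside a
# subshell so that the caller's PATH is left untouched.
run_with_path_prefix() {
    (
        PATH="$1:$PATH"
        export PATH
        shift
        exec "$@"
    )
}

# Intended use, assuming GNU patch lives in /usr/local/bin:
#   run_with_path_prefix /usr/local/bin tla commit
```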

Where to run TLA commands?

Question: I have read through a lot of documentation, and I got the impression that TLA must always be called from the top directory of a source tree. Is this correct, or is it true only for some commands, or not at all? TIA.

Answer: No command has to be run at the top of the tree; subdirectories of a project tree are also okay. All of the commands listed under Project Tree Commands in help must be run in a tree, except for init-tree. All of the Project Tree Inventory commands and all of the Patch Log commands must be run in a tree as well, and so must import, commit, update, star-merge, sync-tree, missing, join-branch and replay.
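
Being "in a tree" here means that the current directory, or one of its parents, contains the {arch} control directory. The sketch below is a plain-shell illustration of that lookup, not tla's actual implementation; find_tree_root is a hypothetical name.

```shell
# find_tree_root [DIR]
# Hypothetical helper: walks upwards from DIR (default: the current
# directory) and prints the first directory that contains {arch}.
# Returns non-zero if no such directory is found.
find_tree_root() {
    dir=$(cd "${1:-.}" && pwd -P) || return 1
    while :; do
        if [ -d "$dir/{arch}" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        if [ "$dir" = / ]; then
            return 1
        fi
        dir=$(dirname "$dir")
    done
}
```

Calling find_tree_root from any subdirectory of a project prints the top of that project tree, which is a quick way to check whether a command that needs a tree has a chance of working from where you are.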

Combining the archive-setup and init-tree command

Question: Why doesn't the init-tree command do the archive-setup part? I understand that from the developer perspective they operate on different directories, so it is easier to write them as separate commands, but from the user perspective it only adds one unintuitive command to the whole process.

Answer: Good question. Note that import (and tag) will do the archive-setup part, if given a -S parameter.

No documentation?

Question: I have recently installed arch on Linux, but I cannot find any docs - neither man, nor info, nor anything else was installed. Have I missed something? BTW, I have read that tla is supposed to be self-documenting; however, when I say "tla command --help" all I get is a short list of options, without any indication what "command" really is good for.

Answer: Tla's documentation is all inline documentation. Try "tla command -H" and of course "tla help" for more detailed help. The other sources of info are the tutorial and this wiki.

Move archive to new location

Question: I currently have my archive in ~/.arch-archives but want to move it to /var/local/tla/archives or so. Is it sufficient to just move the directory there and invoke "tla register-archive --force", or do I need to do more?

Answer: No, you don't need anything more. Even in distributed development, moving an archive only requires your partners to re-register the new archive location. The important thing is to keep the same archive name.

Revisions of dot files

Question: Quite often I find myself recompiling my kernel, and I thought it might be a good idea to keep track of changes to the kernel .config. So I made a new archive, did tla add .config, id-tagging-method is explicit by default, but when I tried to tla import, arch complained about a violated naming convention. I tried adding source \..* to .arch-inventory, but now tla import complains that .arch-inventory is source but has not been added. Is ARCH the right tool for keeping track of my kernel .config (or any other dot-file)? If it is, what is the simplest way of doing so? And finally, why should the naming convention matter for explicit tagging at all?

Answer: .arch-inventory is used internally by tla, but is still a file like the others in the archive, so you need to add it with tla add .arch-inventory. And yes, arch is suited to keep track of your .config file. You could also have added source ^\.config$ to {arch}/=tagging-method. Another option is to have a config file, without the dot, in your archive, and to copy or link it to /src/linux/.config on demand. That's a matter of taste, but I don't like having hidden files in my archive.

Question: Why should the naming convention matter for explicit tagging at all? Here is another problematic situation with the naming convention. I have a project where I need to keep some .o or .a files in the archive as source files (e.g. I can't recompile those), but most .o and .a files elsewhere are just the result of compilation and should not be in the archive. AFAICS, the only way to reconcile the naming convention with that is to explicitly name all those .o files in the =tagging-method file, which is stupid since I already did a tla add for each of those files. Why doesn't arch understand that once I did tla add on a file, this file is a source file?

Additional Comment: The naming conventions also seem to be woefully C/C++/Java/...-centric. E.g., when you are doing Lisp, a core is likely to be valuable. For what it's worth, I think that a more modular approach to naming conventions is called for, or at least a way to get them out of my way entirely.

Answer: Aside from the '+' and ',' conventions, tla inventory conventions are totally customizable. If you don't want core to be unrecognized, just edit your =tagging-method. If you have a set of customizations you use frequently, set them up as a package, and tag off them instead of importing.

Additional Comment: I recently looked at Zope3 which seems to have a large number of file names starting with two plus signs.

Constructing a report of changes since a revision

To do a release, I produce a report that shows the changes since the last release. Most patch-log commands operate on versions, not revisions. I don't always have a tag to represent a merge (either it isn't done yet, or I just want an interim report).

Question: How do I get a report like changelog, but only including information since revision x, up to revision y?

Answer: tla changes -v yourCategory--development--1.0--patch-2 will give part of the answer. It lists (patch report) the added/moved/changed/etc. files since ...patch-2, up to the current state of your project tree. It does not include all the information you can get from logs (e.g. the summary), does not let you pick an arbitrary y, and will reflect uncommitted changes.

Answer: Use tla logs with awk to get the summary (and combine with above): tla logs -s | awk '/^patch-3/,\!//{print}' | awk '/^[[:blank:]]/ {print}'. Note the patch-3 in there: replace with the first version you want info on. (what's a better range expression for awk that means to the end?)
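
On the awk question: a range whose end pattern is the constant 0 never closes, so /^patch-3/,0 runs to the end of the input. The same effect can be had with a flag, which also makes it easy to match the patch name exactly. The function below is a sketch under assumptions: summaries_from is a hypothetical name, and the input is assumed to look like the output the command above expects, that is, a patch level alone on a line followed by its indented summary.

```shell
# summaries_from FIRST
# Hypothetical helper: reads "tla logs -s"-style text on standard input
# and prints the indented summary lines from patch level FIRST onwards.
summaries_from() {
    awk -v first="$1" '
        $0 == first { seen = 1 }      # exact match: patch-3 is not patch-30
        seen && /^[ \t]/ { print }    # summaries are the indented lines
    '
}

# Intended use, assuming tla is installed:
#   tla logs -s | summaries_from patch-3
```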

Is the limit argument useful (there is no documentation on it)?

Does delta have a way to do this?

Better answers?

Information about the history of a file

Question: How do I get answers to these questions: Who changed this file, and when? Who wrote this line in this file? What has happened to this file? Where does this file come from? In CVS there are commands like cvs log and cvs annotate that can give this kind of information.

Answer: The TlahistoryScript will tell you which patch affects a file, why, when, and by whom. And, after writing it, I noticed tla-file-log in tla-tools (though I haven't looked at it).

I imagine a suitably clever script can take a patch (from tla changeset) and tell you which lines were changed by whom. The TlahistoryScript could give you the patch and person as part of that solution.

See also AnnotateOrBlame for discussion on writing a native blame.

Server software?

Question: When arch uses FTP for storage without running a server, how can things like hook scripts work (for sending emails after commits and such)?

Answer: Hooks are run client-side only with Arch. For centralized development, this means all developers have to set up their hooks consistently. This may change in the future if someone implements a dedicated Arch server. If you need server-side hooks, you can also use a cron job on the server.
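
A client-side hook might look roughly like the sketch below. The interface is an assumption to be checked against tla's own help before relying on it: the sketch supposes that tla passes the action name (such as commit) as the first argument and exposes the archive and revision in the ARCH_ARCHIVE and ARCH_REVISION environment variables. handle_hook is a hypothetical name, and the sketch prints a line where a real hook would send mail.

```shell
# handle_hook ACTION
# Assumed interface (verify against your tla version): ACTION is the
# name of the operation, and ARCH_ARCHIVE / ARCH_REVISION are set in
# the environment by tla.
handle_hook() {
    case "$1" in
        commit)
            # Replace printf with a mail command to send notifications.
            printf 'New revision: %s/%s\n' "$ARCH_ARCHIVE" "$ARCH_REVISION"
            ;;
        *)
            # Ignore every other action.
            ;;
    esac
}
```

Every developer would need the same script installed, which is the consistency requirement mentioned in the answer.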

Merging file contents?

Question: How does one merge the contents of three files and also merge the revision history? If I'm converting a file-per-function code into a file-per-module one, I really want to know what the prior history was.

Answer: That's not supported. The smallest unit for revision history is the file, not a section of a file, so you can't say "this section of my merged file has this history, and this other section has that history".

Question2: Ok, but what about merging three whole files into one? Say I want to merge a.c and b.c into main.c in their entirety.

Answer2: That's not supported. You can do it, and it will work fine, but Arch won't understand. Specifically, if you later try to apply a changeset that modifies a.c, the modification will not be applied to main.c. Many programming languages provide ways to include one file into another programmatically, instead of combining the actual files, so that may be an option for you.

Symbolic, hard links?

Question: How does tla cope with links? Say I need or want two different files to always have the same contents, but they need to exist at two different locations in a tree. The normal thing to do would be to set up a symbolic or hard link between the files - but how do I tell tla that one is just a link to the other one? There is no tla ln or tla link command AFAICT.

Answer: Just create the symbolic link and tla add it. tla will record the history of the link's destination, just as it does for the contents of files.

Synchronization with off-line computer

Question: I am usually working on two computers, A and B, of which A has net access but B does not. The only feasible way of transferring data between A and B is with floppy disks, which means that you really want to keep the data to transfer as low as possible. How can I handle that with tla? Ideally I would like to be able to work on one and the same project on both, but I need access to the committed versions at least. Is it sufficient to just carry the archives around, register the copied archive on the other computer, and tag from there?

Answer: I would seriously recommend something that stores a bit more data than a floppy disk, such as a USB key. That being said, I can't imagine why your scheme wouldn't work. Take a look at the docs on how to manage library-revisions to help with the restrictive space requirements you have on a floppy.

Partial Commit Bug

There is a bug in tla, which makes partial commits fail in some situations. For example:

> tla commit -- \{arch\}/\=tagging-method
make-changeset-files: file missing from ORIG tree ({arch}/=tagging-method)

Since this is a very basic operation, I was wondering how other users deal with this issue. I also hope that it will be fixed soon. --aku

Answer: The bug is that you have both modified and added files and are trying to commit only one. What I normally do if I find myself in that situation is

$ tla changes -o ../,,problemo

Pull out and apply the specific change I want manually, then reapply the changeset I created with

$ tla do-patch ../,,problemo .

Sometimes tla complains and I need to manually look at .rej files but mostly it works.

Starting Tla Development

Q: If someone wants to contribute a fairly simple patch, how does he / she determine which tree to use as a base? Is there a listing of prioritized trees somewhere? Also, what exactly would a beginner fetch? Personally, I don't feel like understanding the full config system for a pretty simple change (fixing pointers passed through varargs on 64-bit machines).

Ok, let me refresh the question. If we have bug fixes to contribute, which tree should we use as a base? And where do the fixes go? Some are _simple_ patches, but I get the feeling people would rather use Arch than good old reliable email.

Signing and Shared Archive

Question: What is the best way to go about having a shared archive signed, and checking the signatures?

Translation

Question: Can arch be used for translation, like SVK does with smerge?

Post-Hoc Configurations (combining project trees)

Question: If I have two project trees, a and b, in a directory, how can I have arch treat them like one tree? I'd like to do a tla changes and see changes in both, etc. But, I don't want to put a in b, or vice-versa. CVS, for example, looks inside all the child directories when you invoke a command, so you get this behavior for free (even when you don't want it).

Configurations seem to be the way arch would like to do it (see above, Arch Recipes, and Tla Reference/build-config, for example). But this supposes that you wish to start with tla get. What if I already have two trees? What are the steps to go from two separate trees to a configuration?

Updating a checked out version

Question: This should be simple, but I've spent hours looking for the solution. I want to check out a specific revision of a project, let's say foo--mainline--5.0--patch-25. I try to do this with tla get machine@company.com /foo--mainline--5.0--patch-25 foo but it fails with a broken pipe error, which is a known bug (bug #7708). So I need a workaround. I am able to check out a previous version successfully, say foo--mainline--1.0--patch-56. Given that I have a working directory with foo--mainline--1.0--patch-56 in it, how do I update it to foo--mainline--5.0--patch-25? Note that foo--mainline--5.0 is a sub branch (descendant) of foo--mainline--1.0. I've tried delta, but it attempts to check out the foo--mainline--5.0--patch-25 revision, which again fails with the broken pipe error.

Answer: Did you try the latest version of tla (1.3 release candidate) to see if your bug was fixed? Otherwise, use "tla replay" to add the missing patches to your working directory.

Easily increase the version number

Question: In the tutorial, I found that projects are named like this: name--branch--version, with the version number consisting of two parts, "major.minor". But when I have a branch with base-0 .. patch-10, how can I increase the version number so that my patch-11 becomes the base-0 of the next version (minor + 1)? That way I'd be able to tla get name--branch--version+1.

Answer: Finally, I found the info on an external mailing list. It should be in the tutorials, I think. If you have test--trunk--0.1--base-0, and you want to call it --0.2:

$ tla tag test--trunk--0.1--base-0 test--trunk--0.2
$ tla get test--trunk--0.2
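
The new version name passed to tla tag can be derived mechanically from the old one. A minimal sketch, assuming version names of the form name--branch--major.minor with plain decimal numbers and no leading zeros; next_minor_version is a hypothetical helper, not a tla command.

```shell
# next_minor_version NAME--BRANCH--MAJOR.MINOR
# Hypothetical helper: prints the same version name with MINOR
# increased by one.
next_minor_version() {
    version_name=$1
    prefix=${version_name%--*}     # e.g. test--trunk
    number=${version_name##*--}    # e.g. 0.1
    major=${number%%.*}            # part before the first dot
    minor=${number#*.}             # part after the first dot
    printf '%s--%s.%s\n' "$prefix" "$major" "$((minor + 1))"
}

# Intended use, assuming tla is installed:
#   tla tag test--trunk--0.1--base-0 "$(next_minor_version test--trunk--0.1)"
```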

Merging and patch logs

Question: Why, in tla, must there be a new patch added to my local tree-version branch when I merge in patches from another branch? The merged patches show up in the log history, so what's the point of the local patch? I guess this isn't so much a question about how things work now as it is a question about why things work the way they do.

Answer: If you don't have a new patch in your tree-version's branch, then the merge is not complete.

This is the way it works now. But does it have to? Why couldn't you just look at tla logs (assuming you had committed the merge patches already)?

Possible explanation (feel free to correct): Conceptually, a merge is just a special form of edit done to a tree. So that editing operation in itself needs to be logged (as part of the current branch). But the edit itself consists in part of adding all the patch logs that came with the merge to the current tree, as well as the actual changes in the corresponding patch changesets. So as far as merging is concerned, in a limited sense, patch logs are simply normal files in the tree. [Right?]

But then here's another question. Are the changesets for the merged patches from the remote branch stored in the local tree the same way as the patch logs, or are they discarded during the merge (since they're not part of the tree anyway)? Does the local tree's changeset for the merge include all the changes from the merge patches combined into a single patch?

What if the programmer applies the patches from the remote tree and then manually undoes the changes from one of the patches before committing? Do future star-merge commands and similar just trust that the patch was actually applied correctly to the revision? I assume there would probably be conflicts if another patch touched the same area of source code, assuming that the previous patch had been applied.

For the patches to show up in log history, there must be a log containing them :)

Are you talking about the local merge patches, or the merged-in remote patches? Because whether or not there's a log for the merge changeset, the added logs from the remote tree should show up.

Wow. This is a lot to wrap one's head around.

Get a removed/changed file back

Question: Is it possible to get a removed or locally changed file back?

Answer: Yes. See "Undoing changes to a single file" in Arch Recipes. Tools like Fai will also do it. --AaronBentley

'tla add' replaced by 'tla add-id'

Question: The 'tla add' command does not work.

Answer: As stated on the mailing list (http://lists.gnu.org/archive/html/gnu-arch-users/2005-05/msg00031.html), use the 'tla add-id' command instead of 'tla add' (tla 1.3.1 and later).

As of 1.3.4, tla add is restored and is now an alias for tla add-id.

ftp transport: no passive mode support

Question: My FTP server does not support passive mode, so running 'tla archive-mirror xxx' hangs. Any hints on how to get around that?

Partial commit in a dirty source tree

Question: Sometimes you make a small change to your source code, or even just to the documentation, and want to commit it. But if, for example, the source has been built, the commit fails, complaining about files that violate the naming conventions. Is there any workaround other than cleaning the whole source tree?

Ask Arch questions (last edited 2006-10-23 11:35:40 by ACADC750)