Cross Reference: /pkg/doc/on-disk-format.txt
on-disk-format.txt revision 1968
1968N/Apkg(5): image packaging system
1968N/A
1968N/AThis information is Copyright (c) 2010, Oracle and/or its affiliates.
1968N/AAll rights reserved.
1968N/A
1968N/AON-DISK FORMAT PROPOSAL
1968N/A
1968N/A1. Introduction
1968N/A 1.1. Date of This Document:
1968N/A
1968N/A 06/02/2010
1968N/A
1968N/A 1.2. Name of Document Author/Supplier:
1968N/A
1968N/A Shawn Walker, Oracle,
1968N/A on behalf of the pkg(5) project team
1968N/A
1968N/A 1.3. Acknowledgements:
1968N/A
This document is largely based on comments from the following
individuals, to whom the author is exceedingly indebted:
1968N/A
1968N/A - Danek Duvall
1968N/A - Mike Gerdts
1968N/A - Stephen Hahn
1968N/A - Krister Johansen
1968N/A - Dan Price
1968N/A - Brock Pytlik
1968N/A - Bart Smaalders
1968N/A - Peter Tribble
1968N/A
1968N/A2. Project Summary
1968N/A
1968N/A 2.1. Project Description:
1968N/A
1968N/A "...the repository can be archived up, put on a CD, memory
1968N/A stick, 2D barcode, and protected by the Black Knight, fire
1968N/A moats, komodo dragons, etc." - Danek Duvall
1968N/A
1968N/A pkg(5) is primarily a network-oriented binary packaging system.
1968N/A Although some of the tools it provides support filesystem-based
1968N/A operations for publication, the primary expected use for package
1968N/A operations (such as install, update, search, etc.) is between an
1968N/A intelligent client and one or more servers that provide access
1968N/A to a package repository and/or other interactive services.
1968N/A
1968N/A This project seeks to define and establish an on-disk format
1968N/A (and corresponding container format), for the pkg(5) system,
1968N/A with the intent that it can enable the ubiquitous, transparent
1968N/A use of package data from filesystem-based resources.
1968N/A
1968N/A The changes proposed by this project are evolutionary, not
1968N/A revolutionary, in nature. In particular, this project seeks
1968N/A to refine and adopt the existing repository format used by the
1968N/A pkg(5) depot server as the on-disk format. Supplementary to
1968N/A that, it also seeks the addition of a container format to ease
1968N/A provisioning of the on-disk format, and the unification of the
1968N/A scheme used by the client and server to store package data.
1968N/A
1968N/A 2.2. Problem Area:
1968N/A
1968N/A For some deployments, network-based package data access is not
1968N/A possible or is undesirable. Concerns often cited in this area
1968N/A include:
1968N/A
1968N/A - lack of access control or ability to easily integrate with
1968N/A existing access control systems,
1968N/A
1968N/A - inability to rely on alternative (or existing) provisioning
1968N/A arrangements (such as NFS-based file servers),
1968N/A
- environmental or procedural requirements that prohibit access
  to, or use of, a network-based service,
1968N/A
1968N/A - characteristics of network protocols (such as HTTP, etc.) that
1968N/A artificially limit functionality or performance (as opposed to
1968N/A iSCSI or other alternatives),
1968N/A
1968N/A - ease of administration of filesystem-based resources, and
1968N/A
1968N/A - ease of transferring package data.
1968N/A
1968N/A3. Project Technical Description:
1968N/A 3.1. Details:
1968N/A
This project defines an on-disk format (and corresponding
container format) that is intended for the supplemental or complete
provisioning of package data at all stages of the package
lifecycle: when package data is published, stored by the client or
server, or otherwise used during package operations.
1968N/A
1968N/A The on-disk format (defined in detail later in this document)
1968N/A is intended to be distributable in its raw form (a pre-defined
1968N/A structure of directories and files) or within a container format
1968N/A (such as a zip file, etc.).
1968N/A
1968N/A Out of necessity, the use of filesystem-based resources (such as
1968N/A those provided by the on-disk format) will sometimes limit the
1968N/A operations that can be performed to a subset of those normally
1968N/A available when interacting with a network-based repository. For
1968N/A example, search and publisher configuration may not be possible,
1968N/A and purely interactive services such as the BUI (Browser UI)
1968N/A offered by the depot server for a repository, RSS feeds, and
1968N/A others will not be available.
1968N/A
1968N/A Because of the wide-ranging impact of the changes required to
1968N/A implement this functionality, it is intended that the project
1968N/A be implemented in the following sequence:
1968N/A
1968N/A - Client Support for filesystem-based Repository Access
1968N/A
1968N/A - Depot Storage, Client Transport and Publication Tool Update
1968N/A
1968N/A - Client Storage and Image Format Update
1968N/A
1968N/A - Client and Depot Support for On-Disk Archive Format
1968N/A
1968N/A 3.2. Bug/RFE Number(s):
1968N/A
1968N/A As an example of the kinds of defects and RFEs intended to be
1968N/A resolved by this project, see the following selection of
1968N/A defect.opensolaris.org bug IDs:
1968N/A
1968N/A2152 standalone package support needed (on-disk format)
1968N/A166 depot doesn't set directory mode when creating directories
1968N/A2086 validate that a repository is really a repository in pkg.depotd
1968N/A6335 publisher repo with invalid certificate information shouldn't
1968N/A prevent querying other repos
1968N/A6576 pkg install/image-update support for temporary publisher origins
1968N/A desired
1968N/A6940 depot support for file:// URI desired
1968N/A7213 ability to remove published packages
1968N/A7273 manifests should be arranged in a hierarchy by publisher
1968N/A7276 /var/pkg metadata needs reorg (looks busy)
8433 client and pull need to refer to "repository" instead of
     "server"
1968N/A8722 advanced repository metadata store needed
1968N/A8725 versioning information for depot and repository metadata needed
1968N/A9571 CachedManifest should be named FactoredManifest
1968N/A9572 CachedManifest should allow consumers to specify cache location
1968N/A9872 publication api should use new transport subsystem
1968N/A9933 ability to control repository creation behaviour or removal of it
1968N/A10244 caching dictionaries as a class variable prevents multi-image and
1968N/A repo search
1968N/A11362 Image update dying when trying to talk to a disabled and offline
1968N/A publisher
1968N/A11740 publishers with installed packages should not be removable
1968N/A12814 publisher prefixes should be forcibly lower-cased or case
1968N/A insensitive
1968N/A14802 ability to have separate read / write download caches
1968N/A15320 pkgsend will traceback if unable to parse server error response
1968N/A15371 repository property defaults opensolaris.org-specific
1968N/A
1968N/A 3.3. In Scope:
1968N/A
1968N/A Filesystem-based data resourcing for package operations.
1968N/A
1968N/A 3.4. Out of Scope:
1968N/A
1968N/A Package signing and fine-grained access control for package
1968N/A repositories.
1968N/A
1968N/A4. On-Disk Format Technical Description:
1968N/A 4.1. Overview:
1968N/A
1968N/A The on-disk format is intended to exist both in a raw format as
1968N/A a pre-defined structure of directories and files, and in an
1968N/A archive format which is primarily a simple container for
1968N/A the raw format.
1968N/A
1968N/A 4.2. Raw Format:
1968N/A
1968N/A 4.2.1. Goals:
1968N/A The goals for the raw on-disk format include:
1968N/A
1968N/A - unification of client and server package data storage
1968N/A for data common to both,
1968N/A
1968N/A - transparent usage of package data regardless of operation
1968N/A or use by client or server,
1968N/A
1968N/A - ease in composition and decomposition of package data
1968N/A stored within by publisher or package,
1968N/A
1968N/A - re-use of existing publication tools for on-disk format,
1968N/A
1968N/A - enablement of future publication tools to automatically
1968N/A be able to manipulate or use on-disk format, and
1968N/A
1968N/A - ease of provisioning.
1968N/A
1968N/A 4.2.2. Raw Format specification:
1968N/A
1968N/A The pkg(5) repository format is a set of directories and
1968N/A files that conform to a pre-defined structure.
1968N/A
1968N/A For a version 1 repository (the current format), the
1968N/A structure is as follows:
1968N/A
<REPO_ROOT>/
    catalog/
        <catalog v1 files>
    index/
        <index files>
    file/
        <first two letters of file hash>/
            <file-named-by-hash>
    pkg/
        <stem>/
            <manifest-file>
    trans/
        <in-flight transaction files>
    cfg_cache       (optional repository configuration file)
1968N/A
Version 2 of the repository format eliminates the potential
for unintended collisions between package metadata from
different publishers and simplifies composition and
decomposition of repository content. The top level is an optional
shared storage space for data common to all publishers in
the repository, while the publisher subdirectory contains
data specific to a publisher. It is essentially a nested
repository format, and can be defined as follows:
1968N/A
<REPO_ROOT>/
    file/                                   (optional)
    publisher/                              (optional)
        <prefix>/                           (optional)
            catalog/                        (optional)
                <catalog v1 files>
            file/                           (optional)
                <first two letters of file hash>/
                    <file-named-by-hash>
            index/                          (optional)
            pkg/                            (optional)
                <stem>/
                    <manifest-file-for-pkg-version>
            trans/                          (optional)
                <in-flight transaction files>
            pub.p5i                         (optional)
    pkg5.repository                         (required)
1968N/A
1968N/A By default, repository operations will store data in the
1968N/A publisher-specific location found under publisher/<prefix>
1968N/A for new repositories.
1968N/A
1968N/A In the case that the top-level file/ directory is used,
1968N/A automatic decomposition of contents into its publisher-
1968N/A specific components will not be possible unless
1968N/A corresponding package manifests are also available.
1968N/A
1968N/A To support easy composition, filtering, and creation of
1968N/A package archives, directories above marked with the text
1968N/A '(optional)' must not be required. The behaviour of
1968N/A consumers accessing the contents of the repository should
1968N/A be as follows based on the directory accessed:
1968N/A
1968N/A - file/
1968N/A This optional directory serves as a place to store file
1968N/A data for more than one publisher. Package files are
1968N/A stored in gzip format using a sha1sum of the file as the
1968N/A filename, and then the first two letters of the filename
1968N/A as the parent directory's name.
1968N/A
1968N/A - publisher/<prefix>/catalog/
1968N/A If absent, consumers should determine the list of
1968N/A packages available based on the manifest files present
1968N/A in the publisher/ subdirectory. If present, consumers
1968N/A should expect v1 (or newer) catalog files, or none at
1968N/A all, to be contained within.
1968N/A
1968N/A - publisher/<prefix>/file/
1968N/A Consumers should always check this subdirectory first
1968N/A (if present) when retrieving package file data if the
1968N/A publisher is known. Package files are stored in gzip
1968N/A format using a sha1sum of the file as the filename, and
1968N/A then the first two letters of the filename as the parent
1968N/A directory's name.
1968N/A
- publisher/<prefix>/index/
  If absent, search functionality should be disabled for
  this publisher, or a fallback to 'slow manifest-based
  search' performed. If present, consumers should expect
  v1 (or newer) search files, or none at all, to be
  contained within.
1968N/A
1968N/A - publisher/<prefix>/pkg/
1968N/A If absent, search must be disabled for this publisher
1968N/A even if index is present. If present, manifests are
1968N/A stored in pkg(5) manifest format using the uri-encoded
1968N/A version of the package FMRI as the filename, and using
1968N/A the uri-encoded package FMRI stem (name) as the parent
1968N/A directory's name.
1968N/A
- publisher/<prefix>/trans/
  If absent, this directory will be created during
  publication operations. If present, in-progress
  transaction data is stored in a directory named
  by the open time of the transaction as a UTC UNIX
  timestamp plus an '_' and the URI-encoded package
  FMRI. As an example:
1968N/A
1968N/A 1245176111_pkg%3A%2FBRCMbnx%400.5.11%2C5.11-0.116
1968N/A %3A20090616T181511Z
1968N/A
1968N/A - publisher/<prefix>/pub.p5i
1968N/A This pkg(5) information (p5i) file should contain
1968N/A suggested configuration information for clients such as
1968N/A origins, mirrors, alias, etc. Consumers can use this to
1968N/A provide clients with initial or suggested configuration
1968N/A information for a given publisher. If not present, the
1968N/A publisher's identity should be assumed based on the
1968N/A directory structure, while the refresh interval should
1968N/A be assumed to be 4 hours.
1968N/A
1968N/A - pkg5.repository
1968N/A This file serves as an identifier and a place to store
1968N/A configuration information specific to the repository.
1968N/A It *is not* an equivalent to the existing cfg_cache
1968N/A file which will no longer be used. Its format and
1968N/A structure are as follows:
1968N/A
1968N/A [repository]
1968N/A version = <integer>
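
The naming rules in the list above can be sketched in Python as
follows. This is a minimal illustration and not the pkg(5)
implementation: all function names are hypothetical, and it assumes
that "uri-encoded" means percent-encoding every reserved character
(including '/'), which is consistent with the example names shown
in this document.

```python
import os
from urllib.parse import quote


def file_relpath(sha1_hex):
    """Relative path of a package file: <first two letters>/<hash>."""
    return os.path.join(sha1_hex[:2], sha1_hex)


def manifest_relpath(stem, version):
    """Relative path of a manifest: <encoded stem>/<encoded version>."""
    return os.path.join(quote(stem, safe=""), quote(version, safe=""))


def transaction_dirname(open_time, fmri):
    """Transaction directory: <UTC UNIX timestamp>_<encoded FMRI>."""
    return "{0:d}_{1}".format(int(open_time), quote(fmri, safe=""))


def find_file(repo_root, prefix, sha1_hex):
    """Look in publisher/<prefix>/file/ first, then the shared file/."""
    candidates = [
        os.path.join(repo_root, "publisher", prefix, "file"),
        os.path.join(repo_root, "file"),
    ]
    for base in candidates:
        path = os.path.join(base, file_relpath(sha1_hex))
        if os.path.isfile(path):
            return path
    # Neither optional directory holds the file.
    return None
```

Because every directory other than pkg5.repository is optional,
find_file() simply skips locations that do not exist rather than
treating their absence as an error.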
1968N/A
1968N/A Any information found in the cfg_cache used in the previous
1968N/A repository format related to a publisher is now stored in
1968N/A the pub.p5i file for the related publisher. (Examples of
1968N/A information include origins, mirrors, maintainer info,
1968N/A etc.) As a result, the cfg_cache file is no longer used.
1968N/A
Any depot-specific properties, such as the feed icon, logo,
etc. are now completely managed using SMF or a user-provided
configuration file. This change was made not only to
simplify configuration, but also to separate depot configuration
from repository configuration.
1968N/A
1968N/A An example version 2 repository might be structured as
1968N/A follows:
1968N/A
<REPO_ROOT>/
    publisher/
        example.com/
            catalog/
                catalog.attrs
                catalog.base.C
            file/
                ff/
                    fffff277f5a8fb63e57670afc178415c2c5e706d
            index/
                __at_depend
                ...
            pkg/
                package%2Fpkg/
                    0.5.11%2C5.11-0.136%3A20100327T063139Z
            trans/
                1245176111_pkg%3A%2FBRCMbnx%400.5.11%2C5.11-0.116
                    %3A20090616T181511Z
            pub.p5i
        example.net/
            catalog/
                catalog.attrs
                catalog.base.C
            file/
                af/
                    affff277f5a8fb63e57670afc178415c2c5e706d
            index/
                __at_depend
                ...
            pkg/
                package%2Fpkg/
                    0.5.11%2C5.11-0.133%3A20090327T062137Z
            trans/
                1245176111_pkg%3A%2FFAAMbnx%400.5.11%2C5.11-0.139
                    %3A20100616T181511Z
            pub.p5i

    pkg5.repository:
        [repository]
        version = 2
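
As a rough sketch of how a consumer might inspect a layout like the
one above, the following Python reads the repository version and
lists the publisher prefixes. It assumes that pkg5.repository can
be parsed as an INI-style file by Python's configparser module,
which the "[repository]" / "version = <integer>" form suggests but
this document does not state. The function names are hypothetical.

```python
import configparser
import os


def repository_version(repo_root):
    """Return the integer version stored in <repo_root>/pkg5.repository."""
    parser = configparser.ConfigParser()
    with open(os.path.join(repo_root, "pkg5.repository")) as f:
        parser.read_file(f)
    return parser.getint("repository", "version")


def list_publishers(repo_root):
    """Return publisher prefixes found under the optional publisher/ dir."""
    pub_dir = os.path.join(repo_root, "publisher")
    if not os.path.isdir(pub_dir):
        # publisher/ is optional, so its absence means "no publishers".
        return []
    return sorted(
        name for name in os.listdir(pub_dir)
        if os.path.isdir(os.path.join(pub_dir, name))
    )
```

For the example repository shown above, list_publishers() would be
expected to return the two prefixes example.com and example.net.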
1968N/A
1968N/A 4.3. Archive Format:
1968N/A
1968N/A 4.3.1. Requirements:
1968N/A
1968N/A The requirements for the on-disk archive format include:
1968N/A
1968N/A - support for archives greater than 8GB in size,
1968N/A
1968N/A - support for files in archive greater than 4GB in size,
1968N/A
1968N/A - support for efficient storage of hard links,
1968N/A
- support for pathnames significantly greater than 255
  characters in length,
1968N/A
1968N/A - core Python bindings exist or can be easily created using
1968N/A an existing library,
1968N/A
1968N/A - can be a container of compressed files, as opposed to a
1968N/A compressed container of uncompressed files,
1968N/A
1968N/A - open, royalty-free, well-documented format with wide
1968N/A platform support and acceptance,
1968N/A
1968N/A - multi-threaded decompression and compression possible,
1968N/A
1968N/A - creation and basic manipulation of package archives
1968N/A possible using widely-available tools,
1968N/A
1968N/A - simple composition and filtering of its content should be
1968N/A possible, and
1968N/A
1968N/A - random access to the archive contents must be possible
1968N/A without reading the entire archive file.
1968N/A
1968N/A 4.3.2. Candidates:
1968N/A
1968N/A A number of potential archive formats have been considered
1968N/A for use, including:
1968N/A
1968N/A - 7z (7-Zip)
1968N/A - cpio
1968N/A - pax (portable archive exchange format)
1968N/A - ZIP
1968N/A
The evaluations provided for each format here are not
intended to be exhaustive; rather they focus on the specific
requirements of this project. For more information about
these formats, and the documents used to evaluate them,
please refer to section 6 of this proposal.
1968N/A
1968N/A 4.3.3. 7z Evaluation:
1968N/A
1968N/A The 7z format was rejected for the following reasons:
1968N/A
- Does not permit random access to archive contents, or
  requires the entire archive file to access the contents;
  adding this would require a custom variation of 7z.
1968N/A
- Although the 7z format supports compression methods other
  than LZMA, a primary motivator for using 7z would be the
  ability to use LZMA natively as part of the container
  format. However, the tradeoffs in terms of CPU and memory
  footprint currently make LZMA unsuitable for pkg(5) when
  compared to other compression algorithms such as those
  used by gzip(1).

- Use of the 7z format would require integration of the LZMA
  SDK (which also provides a basic 7z API in C) and the
  creation of Python bindings or the integration of a third
  party's (such as pylzma).
1968N/A
1968N/A - No native support for extended attributes or UNIX owner/
1968N/A group permissions.
1968N/A
1968N/A 4.3.4. cpio Evaluation:
1968N/A
The cpio format doesn't natively support random access to
archive contents, but the format itself doesn't prevent
this. An index could be added as the first file in the archive
with the information needed to provide fast, random access
to the archive contents.
1968N/A
1968N/A The cpio format was rejected for the following reasons:
1968N/A
1968N/A - The length of pathnames in cpio archives is limited to
1968N/A 256 characters for the portable format.
1968N/A
1968N/A - Available tools vary significantly in maximum archive size
1968N/A support.
1968N/A
1968N/A - The portable cpio format stores a copy of the file data
1968N/A with every hard link in an archive instead of simply
1968N/A storing a pointer to the source file in the archive.
1968N/A
1968N/A 4.3.4. PAX Evaluation:
1968N/A
The PAX format meets all of the requirements except that of
random access to archive contents. However, the format
itself doesn't prevent this. A table of contents file could
be supplied as the first file in the archive with the
information needed to provide fast, random access to the
container contents.
1968N/A
1968N/A 4.3.5. ZIP Evaluation:
1968N/A
The ZIP format meets all of the requirements listed above
(assuming that ZIP64 extensions are used), with the
exceptions listed below, for which it was rejected:
1968N/A
1968N/A - The use or implementation of some of the functionality
1968N/A documented in the .ZIP file format requires a license from
1968N/A PKWARE.
1968N/A
1968N/A - While random archive content access is possible, the ZIP
1968N/A file format stores the index for the archive at the end of
1968N/A the archive (as opposed to the beginning). This increases
1968N/A the number of round trips that would be required for
1968N/A potential remote random content access. It also means
1968N/A that extraction requires multiple seeks to the end of the
1968N/A file before any content can be extracted from the archive,
1968N/A which can be detrimental to performance for some media
1968N/A types (optical, etc.).
1968N/A
1968N/A 4.3.6. Evaluation Conclusion:
1968N/A
1968N/A Based on the requirements set forth in section 4.3.1, the
1968N/A PAX format was selected as the on-disk archive format
1968N/A for pkg(5) packages. However, to enable efficient access
1968N/A to the archive contents, an index file needs to be present
1968N/A as the first file in the archive.
1968N/A
Early evaluations of an unoptimised prototype were performed
using a repository containing all packages for build 136 and
unbundleds. The on-disk size of the repository was
approximately 4.98G. The resulting archive was 5.0G in size, with
an archive index file 9.7M in size (when the index was
compressed using gzip).

First-time access to the prototype archive for extraction of
a single file after creation yielded a total time of
approximately 5 seconds compared to approximately 36-42 seconds
for utilities such as pax(1), tar(1), or gtar(1).
1968N/A
1968N/A Creation of the archive took 7 minutes, 35 seconds on a
1968N/A custom-built Intel Core 2 DUO E8400, with 8GB Memory,
1968N/A and a 1TB 7200 RPM SATA Drive w/ 64MB Cache.
1968N/A
1968N/A 4.3.7. Package Archive Specification:
1968N/A
1968N/A pkg(5) archive files will have an extension of 'p5p' which
1968N/A will stand for 'pkg(5) package'. The format of these
1968N/A archives matches that defined by IEEE Std 1003.1, 2004 for
1968N/A the pax Interchange Format, with the exception that the
1968N/A first archive entry must not use the optional pax headers
1968N/A allowed by the format, and must contain the index file
1968N/A for the package archive. The layout can be visualised as
1968N/A follows:
1968N/A
1968N/A .--------------------------------------------------------.
1968N/A | ustar header for package archive index file |
1968N/A .--------------------------------------------------------.
1968N/A | file data for package archive index file |
1968N/A .--------------------------------------------------------.
1968N/A | remaining archive data |
1968N/A .________________________________________________________.
1968N/A
1968N/A The reason for this limitation is to ensure that clients
1968N/A performing selective archive extraction can be guaranteed
1968N/A to find the location and size of the package archive index
1968N/A file without knowing the size of the header for the index
1968N/A file in advance (ustar headers are always 512 bytes in
1968N/A size).
1968N/A
In addition, pkg(5) archives in this format make remote,
selective archive access possible. For example, a client
could request the first 512 bytes of a pkg(5) archive file
from a remote repository, then retrieve the archive index
file. Once it has the archive index file, it can then
perform an HTTP/1.1 byte-range request to selectively
transfer the data for a set of specific files from the
archive. This convention also optimises access to the
archive for sources that are heavily biased towards
sequential reads.
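
The byte range for such a request might be computed as in the
following sketch. It relies on the offset definition given in
section 4.3.8 (offsets are relative to the end of the last block of
the index file) and assumes that the index data is padded to a
multiple of 512 bytes as for any other archive entry. The function
and parameter names are hypothetical.

```python
BLOCK = 512


def padded(size):
    """Round a size up to the next multiple of the 512-byte block."""
    return (size + BLOCK - 1) // BLOCK * BLOCK


def range_header(index_size, rel_offset, entry_size, index_header_offset=0):
    """Build an HTTP Range header value covering one archive entry.

    index_size is the size of the index file data, rel_offset and
    entry_size are taken from the entry's line in the index, and
    index_header_offset is where the index file's header starts.
    """
    # Skip the index file's header block and its padded data.
    base = index_header_offset + BLOCK + padded(index_size)
    start = base + rel_offset
    end = start + entry_size - 1  # Range end positions are inclusive.
    return "bytes={0:d}-{1:d}".format(start, end)
```

The returned string would be sent as the value of the Range request
header; how the transport issues the request is outside the scope
of this sketch.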
1968N/A
1968N/A The index file must be named using the following template
1968N/A and be compressed using the gzip format described by RFCs
1968N/A 1951 and 1952, and formatted according to section 4.3.8:
1968N/A
1968N/A p5p.index.<index_file_number>.<index_version>.gz
1968N/A
1968N/A <index_file_number> is an integer in string form that
1968N/A indicates which index file this is. The number only
1968N/A exists so that each index file can remain unique in
1968N/A the archive. An archive may contain multiple index
1968N/A files to support fast archive additions.
1968N/A
1968N/A <index_version> is an integer in string form that
1968N/A indicates the version of the index file. The initial
1968N/A version for this proposal will be '0'.
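
A minimal sketch of recognising index file names that follow this
template is shown below. The regular expression and function name
are illustrative only and are not taken from the pkg(5) sources.

```python
import re

# p5p.index.<index_file_number>.<index_version>.gz
INDEX_NAME = re.compile(r"p5p\.index\.(\d+)\.(\d+)\.gz")


def parse_index_name(name):
    """Return (index_file_number, index_version), or None if no match."""
    match = INDEX_NAME.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A client could use the returned version to decide whether it
supports the index, falling back to plain pax handling otherwise.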
1968N/A
If the first file in the archive is found not to be in the
layout or format shown above, or any of the index files in
the archive are found to not be in a format supported by
the client (version too old or too new), the archive will
be treated as a standard pax archive, and some operations
may not be possible or may suffer degraded performance.
The same is also true if the index file is found to not
match the archive contents.
1968N/A
1968N/A When creating the archive, or adding to an existing archive,
1968N/A new index gzip files should be zero-padded with an extra 256
1968N/A bytes at the end. This reserved space is used for fast
1968N/A additions to existing package archives by updating the
1968N/A previous index file with an entry for the new index file.
1968N/A For example, the first index file's last entry should
1968N/A contain the name and offset of the second index file,
1968N/A and so on.
1968N/A
All pathnames after the first in the archive (if the first
file is the archive index file) must conform to the
repository layout specified in section 4.2.2 of this proposal.
1968N/A
1968N/A Since a pkg(5) repository can contain one or more packages,
1968N/A pkg(5) archive files can also contain the data for one or
1968N/A more packages. This allows easy redistribution of a single
1968N/A package and all of its dependencies in a single file.
1968N/A
Finally, it should be noted that only ASCII pathnames
are expected in the archive, as the raw repository
format does not use or support Unicode pathnames.
1968N/A
1968N/A 4.3.8. Package Archive Index Specification:
1968N/A
1968N/A The pkg(5) archive index file enables fast, efficient access
1968N/A to the contents of an archive. It contains an entry for all
1968N/A files in the archive excluding the index file itself in the
1968N/A following format (also referred to as index format version
1968N/A 0):
1968N/A
1968N/A <name>NUL<offset>NUL<entry_size>NUL<size>NUL<typeflag>NL
1968N/A
1968N/A <name> is a string containing the pathname of the file
1968N/A in the archive using only ascii characters. It can be
1968N/A up to 65,535 bytes in length.
1968N/A
<offset> is an unsigned long long integer in string form
containing the relative offset in bytes of the first
header block for the file in the archive. The offset is
relative to the end of the last block of the index file
in which the entry is listed.
1968N/A
1968N/A <entry_size> is an unsigned long long integer in string
1968N/A form containing the size of the file's entry in bytes
1968N/A in the archive (including archive headers and trailers
1968N/A for the entry).
1968N/A
1968N/A <size> is an unsigned long long integer in string form
1968N/A containing the size of the file in bytes in the archive.
1968N/A
1968N/A <typeflag> is a single character representing the type
1968N/A of the file in the archive. Possible values are:
1968N/A 0 Regular File
1968N/A 1 Hard Link
1968N/A 2 Symbolic Link
1968N/A 5 Directory or subdirectory
1968N/A
1968N/A All values not listed above are reserved for future
1968N/A use. Unrecognised values should be treated as a
1968N/A regular file.
1968N/A
1968N/A An example set of entries would appear as follows:
1968N/A
pkg5.repositoryNUL0NUL2560NUL546NUL0
pkgNUL2560NUL1536NUL0NUL5
pkg/service%2Ffault-managementNUL4096NUL1536NUL0NUL5
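
The following sketch parses index entries one line at a time, in
the field order given by the format line above (name, offset,
entry_size, size, typeflag). It is an illustration rather than the
pkg(5) parser: the names are hypothetical, the input is assumed to
be the already decompressed index data, and no handling of the
reserved zero padding is attempted.

```python
from collections import namedtuple

IndexEntry = namedtuple("IndexEntry", "name offset entry_size size typeflag")


def iter_index_entries(stream):
    """Yield IndexEntry tuples from a binary, line-oriented index stream."""
    for line in stream:
        line = line.rstrip(b"\n")
        if not line:
            continue
        # Fields are NUL-separated; a malformed line raises ValueError.
        name, offset, entry_size, size, typeflag = line.split(b"\0")
        yield IndexEntry(
            name.decode("ascii"),
            int(offset),
            int(entry_size),
            int(size),
            typeflag.decode("ascii"),
        )
```

Because entries are produced lazily, the whole index never needs to
be held in memory; for example, the stream could be the object
returned by gzip.open(path, "rb").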
1968N/A
It should be noted that other possible formats were
evaluated for the index file, including those based
on JSON, XDR, and Python's pack. However, all other
formats were found to be deficient for one or more
of the following reasons:
1968N/A
1968N/A - larger in size
1968N/A
1968N/A - no streaming support (required entire index file be
1968N/A loaded into memory)
1968N/A
1968N/A - significantly greater parsing times
1968N/A
5. Proposed Changes:
1968N/A
1968N/A 5.1. Client Support for filesystem-based Repository Access:
1968N/A
1968N/A The pkg.client.api provided by pkg(5) will be updated to allow
1968N/A access to repositories via the filesystem. All functionality
1968N/A normally offered by pkg.depotd will be supported.
1968N/A
1968N/A pkg(1) and packagemanager(1) will be modified to support the
1968N/A use of URIs using the 'file' scheme. No user visible changes
1968N/A will be made to any existing subcommands or options except
1968N/A that URIs using the 'file' scheme will be allowed.
1968N/A
1968N/A When accessing repositories using the 'file' scheme, clients
1968N/A by default will not copy package file data into the client's
1968N/A cache (e.g. /var/pkg/download). Instead, the transport system
1968N/A will treat configured repositories as an additional read-only
1968N/A cache.
1968N/A
1968N/A 5.2. Depot Storage, Client Transport and Publication Tool Update:
1968N/A
1968N/A The pkg.server.repository module will be updated to support
1968N/A the new repository format outlined in section 4.2.2. Existing
1968N/A repositories will not automatically be upgraded, while new
repositories will use the new format. A new administrative
command, detailed below, will be introduced to allow upgrading
existing repositories to the new format.
1968N/A
These changes will automatically allow the client to access
repositories in the new format when using filesystem-based
access. Older clients will remain unable to access
repositories in the new format.
1968N/A
1968N/A The client transport system will be updated to support all
1968N/A publication operations and the publication tools and project
1968N/A private APIs will be changed to use the client transport
1968N/A system.
1968N/A
1968N/A The '-d' option of pkgrecv(1) will be changed such that if
1968N/A the name of a file with a '.p5p' extension is specified,
1968N/A and that file does not already exist, a pkg(5) archive
1968N/A file will be created containing the specified packages.
If the file already exists, pkgrecv(1) will exit with an error.
1968N/A When pkgrecv(1) creates pkg(5) archive files, it will omit
1968N/A catalog and index data.
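The destination rule for '-d' can be sketched as follows. This
is a minimal illustration under stated assumptions, not the
pkgrecv(1) code: check_archive_destination is a hypothetical
helper, and raising FileExistsError stands in for "exit with
an error".

```python
import os
import tempfile


def check_archive_destination(dest: str) -> bool:
    """Return True if dest names a pkg(5) archive that should be created.

    Raises FileExistsError when a '.p5p' destination already exists.
    Any other destination is left to the existing repository handling
    and yields False.
    """
    if not dest.endswith(".p5p"):
        return False
    if os.path.exists(dest):
        raise FileExistsError("archive already exists: " + dest)
    return True


with tempfile.TemporaryDirectory() as tmp:
    new_archive = os.path.join(tmp, "foo.p5p")
    creates = check_archive_destination(new_archive)  # file not there yet
    open(new_archive, "wb").close()                   # now it exists
    try:
        check_archive_destination(new_archive)
        refused = False
    except FileExistsError:
        refused = True
    repo_dir = check_archive_destination(tmp)         # not a '.p5p' name
```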
1968N/A
1968N/A Due to the transport changes above, pkgrecv(1) will also
1968N/A be able to use pkg(5) archive files as a source of package
1968N/A data. pkgsend(1) will not support the use of pkg(5)
1968N/A archive files as a destination due to the publication
1968N/A model it currently uses.
1968N/A
1968N/A To support the expanded multiple publisher version 2 format
1968N/A of repositories, the depot server will be updated to respond
1968N/A to requests as follows:
1968N/A
1968N/A - If clients include the publisher prefix as part of the request
1968N/A path, then responses will be for that specific publisher's
1968N/A data. For example:
1968N/A
1968N/A http://localhost/dev/opensolaris.org/manifest/
1968N/A 0/opensolaris.org/backup%2Fareca/7.1%2C5.11-0.134
1968N/A %3A20100302T005731Z
1968N/A
1968N/A http://localhost/dev/file/0/opensolaris.org/
1968N/A 2ce6c746c85cd7ac44571d094b53c5fe1bfc32c8
1968N/A
1968N/A - The default publisher specified in the depot configuration
1968N/A will be used when responding to requests for operations that
1968N/A do not include the publisher prefix. For example:
1968N/A
1968N/A http://localhost/dev/manifest/0/
1968N/A backup%2Fareca/7.1%2C5.11-0.134%3A20100302T005731Z
1968N/A
...provides a response identical to the first case where the
publisher prefix was provided as part of the request. Those
expecting to maintain a large population of older clients
should reassign publisher URLs down a level to include the
publisher explicitly, although this is not required for
correct operation.
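The percent-encoding used in the request paths above can be
reproduced with a short sketch. It covers only the default
publisher form (no publisher prefix in the path); manifest_url
is a hypothetical helper, and the path layout is inferred from
the example rather than taken from the depot code.

```python
from urllib.parse import quote


def manifest_url(depot_base: str, stem: str, version: str) -> str:
    """Build a manifest request that relies on the depot's default
    publisher (no publisher prefix in the request path)."""
    # safe="" forces '/', ',' and ':' to be percent-encoded as well.
    return "{0}/manifest/0/{1}/{2}".format(
        depot_base.rstrip("/"), quote(stem, safe=""), quote(version, safe=""))


url = manifest_url("http://localhost/dev", "backup/areca",
                   "7.1,5.11-0.134:20100302T005731Z")
```

Encoding the package stem and the version as separate path
components keeps the '/' inside a stem such as backup/areca
from being read as a path separator.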
1968N/A
1968N/A A new utility named pkgrepo will be added to facilitate the
1968N/A creation and management of pkg(5) repositories. It will have
1968N/A the following global options:
1968N/A
1968N/A -s repo_uri_or_path
1968N/A A URI or path specifying the location of a pkg(5)
1968N/A package repository.
1968N/A
1968N/A -? / --help
1968N/A
1968N/A It will have the following subcommands:
1968N/A
1968N/A create <uri_or_path>
1968N/A Creates a pkg(5) repository at the specified location.
1968N/A Can only be used with filesystem-based repositories.
1968N/A
1968N/A publisher [<pub_prefix> ...]
1968N/A Lists the publishers of packages in the repository:
1968N/A
1968N/A PUBLISHER PACKAGES VERSIONS UPDATED
1968N/A <pub_1> <num_uniq_pkgs> <num_pkg_vers> <cat_last_modified>
1968N/A <pub_2> <num_uniq_pkgs> <num_pkg_vers> <cat_last_modified>
1968N/A ...
1968N/A
1968N/A rebuild
Discards any catalog, search or other cached information
1968N/A found in the repository and then re-creates it based on
1968N/A the current contents of the repository. Can only be used
1968N/A with filesystem-based repositories.
1968N/A
1968N/A refresh
By default, catalogs any new packages found in the
repository and updates search indices. This is intended for
1968N/A use with deferred publication (--no-catalog or --no-index
1968N/A options of pkgsend). Can only be used with filesystem-based
1968N/A repositories.
1968N/A
1968N/A Options:
1968N/A --no-catalog - doesn't add new packages
1968N/A --no-index - doesn't refresh search indices
1968N/A
1968N/A remove fmri_pattern ...
1968N/A Removes the specified package(s) from the repository.
1968N/A If more than one match is found for any given pattern,
1968N/A the exact FMRI must be provided.
1968N/A
1968N/A upgrade
1968N/A Can only be used with filesystem-based repositories.
1968N/A Upgrades the repository to the most current format if
1968N/A possible.
1968N/A
1968N/A Has these options:
1968N/A
-n determine whether the upgrade could be performed and exit
1968N/A
1968N/A -v show a summary of what will be done, the current format
1968N/A of the repository and what it will be upgraded to
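For scripted use, the proposed command lines could be assembled
as in the sketch below. It only builds argument lists and runs
nothing. pkgrepo_argv and the /export/repo path are hypothetical,
and placing the global '-s' option before the subcommand is an
assumption based on its description as a global option.

```python
from typing import List, Sequence


def pkgrepo_argv(repo: str, subcommand: str,
                 options: Sequence[str] = ()) -> List[str]:
    """Assemble (but do not run) a pkgrepo command line."""
    # Assumption: the global -s option precedes the subcommand.
    return ["pkgrepo", "-s", repo, subcommand, *options]


# Catalog newly published packages but leave search indices alone.
refresh = pkgrepo_argv("/export/repo", "refresh", ["--no-index"])
# Check whether an upgrade is possible and show a summary, then exit.
upgrade = pkgrepo_argv("/export/repo", "upgrade", ["-n", "-v"])
```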
1968N/A
1968N/A 5.3. Client Storage and Image Format Update:
1968N/A
1968N/A To simplify and unify the storage format used by the client,
1968N/A and pkg(5) repositories, the format of the client image
1968N/A will be changed to use the structure described below.
1968N/A
1968N/A For a version 4 image (the proposed format), the structure is
1968N/A as follows:
1968N/A
1968N/A <IMG_ROOT>
1968N/A cache/
1968N/A <publisher_prefix>/
1968N/A certs/
1968N/A <publisher signing certificates>
1968N/A pkg/
1968N/A <stem>/
1968N/A <manifest-cache-files>
1968N/A publisher/
1968N/A <as described in section 4.2.2>
1968N/A index/
1968N/A <client search index files>
1968N/A ssl/
<client ssl certificates>
1968N/A history/
1968N/A <client history files>
1968N/A pm_cache/
1968N/A <package manager data files>
1968N/A state/
1968N/A installed/
1968N/A <image catalog files>
1968N/A known/
1968N/A <image catalog files>
1968N/A pkg5.image (client configuration file; was cfg_cache)
1968N/A
A new property named 'version' will be added to the image
and will be read-only (cannot be set using the set-property
subcommand of pkg(1)).
1968N/A
1968N/A Existing images will not automatically be upgraded to the new
1968N/A format. To enable the upgrading of existing images to newer
1968N/A formats, the following subcommands will be added:
1968N/A
1968N/A meta-update
1968N/A Updates the format of the client's image to the current
1968N/A format if possible.
1968N/A
1968N/A Has these options:
1968N/A
-n determine whether the upgrade could be performed and exit

-v show a summary of what will be done, the current format
of the image and what it will be upgraded to
1968N/A
1968N/A 5.4. Client and Depot Support for On-Disk Archive Format:
1968N/A
The pkg.server.repository module will be updated to support
the serving of a repository in read-only mode using a pkg(5)
archive file.
1968N/A
1968N/A The pkg.client.api transport system will be updated to support
1968N/A the usage of a pkg(5) archive file as an origin for package
1968N/A data.
1968N/A
1968N/A To support the specification of temporary origins, the install
1968N/A and image-update subcommands will be modified by adding a '-g'
1968N/A option to specify additional temporary package origin URIs or
1968N/A the path to a pkg(5) archive file or pkg(5) info file. The
1968N/A '-g' option may be specified multiple times. As an example:
1968N/A
1968N/A $ pkg install -g /path/to/foo.p5p \
1968N/A -g http://mytemprepo:10000/ \
1968N/A -g file:/path/to/bar.p5p \
1968N/A foo bar localpkg
1968N/A
1968N/A $ pkg image-update -g /path/to/foo.p5p
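How a client might tell the three kinds of '-g' argument apart
is sketched below. classify_origin is a hypothetical helper.
The '.p5p' extension is the one named in this proposal, while
'.p5i' for pkg(5) info files is an assumption that does not
appear in this section.

```python
from urllib.parse import urlsplit


def classify_origin(arg: str) -> str:
    """Return 'archive', 'info' or 'repository' for one -g argument."""
    # urlsplit handles bare paths as well as file: and http: URIs;
    # only the path component is inspected.
    path = urlsplit(arg).path
    if path.endswith(".p5p"):
        return "archive"
    if path.endswith(".p5i"):  # assumed extension for pkg(5) info files
        return "info"
    return "repository"


kinds = [classify_origin(a) for a in (
    "/path/to/foo.p5p", "http://mytemprepo:10000/", "file:/path/to/bar.p5p")]
```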
1968N/A
1968N/A pkg(5) archive files used as a source of package data during an
1968N/A install or image-update operation will have their content cached
1968N/A by the client before the operation begins. Any publishers found
1968N/A in the archive will be temporarily added to the image if they do
1968N/A not already exist. Publishers that were temporarily added but
1968N/A not used during the operation will be removed after operation
1968N/A completion or failure. Any package FMRIs or patterns provided
1968N/A will be matched using only the sources provided using '-g'.
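The lifetime of temporarily added publishers described above
can be modelled with a context manager. This is a sketch only:
temporary_publishers and the plain set of publisher prefixes
are hypothetical stand-ins for the image configuration.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator, Set


@contextmanager
def temporary_publishers(configured: Set[str],
                         from_archive: Iterable[str]) -> Iterator[Set[str]]:
    """Add archive publishers for one operation; on exit, drop those
    that were added here and never recorded as used."""
    added = {p for p in from_archive if p not in configured}
    configured |= added          # mutates the caller's set in place
    used: Set[str] = set()
    try:
        yield used               # the operation records what it used
    finally:
        # Runs on completion and on failure alike.
        configured -= (added - used)


image_pubs = {"example.com"}
try:
    with temporary_publishers(
            image_pubs, ["example.com", "example.org", "example.net"]) as used:
        used.add("example.org")
        during = set(image_pubs)
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
```

In this run example.org stays in the image because it was used,
example.net is removed again, and example.com is untouched
because it existed before the operation.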
1968N/A
1968N/A The pkg list and pkg info commands will also be updated by
1968N/A adding the '-g' option described above, with the exception
1968N/A that the '-g' option may only be specified once, and only
1968N/A the source named will be used for the operation.
1968N/A
1968N/A Using '-g' with the pkg list subcommand implies '-n' by default,
1968N/A unless '-f' is specified; it also implies '-a'. To list all
1968N/A versions, the '-f' option must be used. As an example:
1968N/A
1968N/A $ pkg list -g /path/to/foo.p5p
1968N/A NAME (PUBLISHER) VERSION STATE UFOXI
1968N/A bar (example.com) 1.0-0.133 known -----
1968N/A foo (example.com) 1.0-0.133 installed -----
1968N/A
1968N/A $ pkg list -g file:/path/to/foo.p5p
1968N/A NAME (PUBLISHER) VERSION STATE UFOXI
1968N/A bar (example.com) 1.0-0.133 known -----
1968N/A foo (example.com) 1.0-0.133 installed -----
1968N/A
1968N/A $ pkg list -f -g http://example.com/multi_foo.p5p
1968N/A NAME (PUBLISHER) VERSION STATE UFOXI
1968N/A foo (example.com) 1.0-0.133 installed u----
1968N/A foo (example.com) 2.0-0.133 known u----
1968N/A foo (example.com) 3.0-0.133 known -----
1968N/A
1968N/A $ pkg list -g file:/path/to/repo
1968N/A NAME (PUBLISHER) VERSION STATE UFOXI
1968N/A repopkg (example.com) 2.0-0.133 known -----
1968N/A
1968N/A $ pkg list -g http://myrepo:10000
1968N/A NAME (PUBLISHER) VERSION STATE UFOXI
1968N/A localpkg (example.org) 3.0-0.133 known -----
1968N/A
1968N/A Using '-g' with the pkg info subcommand implies '-r'. The '-l'
1968N/A option cannot be used in combination with '-g'. As an example:
1968N/A
1968N/A $ pkg info -g /path/to/bundle.p5p
1968N/A Name: bar
1968N/A Summary: A useful complement to foo.
1968N/A State: Not Installed
1968N/A ...
1968N/A Name: foo
1968N/A Summary: Provides useful utilities.
1968N/A State: Installed
1968N/A ...
1968N/A
1968N/A '-g' was chosen for the option usage described above to match
1968N/A the '-g' already used by set-publisher and image-create for
1968N/A origins, and due to the unfortunate existing usage of '-s'
1968N/A by the 'pkg list' subcommand.
1968N/A
6. Reference Documents:
1968N/A
1968N/A Project team members and community members have provided a number of
1968N/A informal comments that served as the basis for the goals of this
1968N/A project:
1968N/A
1968N/A - "new on-disk format?", 18 Jan. 2008:
1968N/A http://markmail.org/thread/2kg6w5bfwp4x3knc
1968N/A
- "reorganising the repository and client metadata", 23 Sep. 2009:
1968N/A http://markmail.org/thread/stfrosvx3v6if2fi
1968N/A
1968N/A - "ZAP - Zip Archive Packaging", Sep. 2007:
1968N/A http://markmail.org/thread/ijyq3mlrhaofccgx
1968N/A
1968N/A In addition, the following materials were referenced when writing
1968N/A this proposal:
1968N/A
1968N/A - "7z", 12 Apr. 2010:
1968N/A http://en.wikipedia.org/wiki/7z
1968N/A
1968N/A - "RFC2616: HTTP/1.1 Header Field Definitions", 01 Sep. 2004:
1968N/A http://www.w3.org/Protocols/rfc2616/
1968N/A rfc2616-sec14.html#sec14.35.1
1968N/A
1968N/A - "cpio", 21 Mar. 2010:
1968N/A http://en.wikipedia.org/wiki/Cpio
1968N/A
1968N/A - "copy file archives in and out", 26 Mar. 2007:
1968N/A http://heirloom.sourceforge.net/man/cpio.1.html
1968N/A
1968N/A - "The gzip file format", Date Unknown:
1968N/A http://www.gzip.org/format.txt
1968N/A
1968N/A - "DragonFly File Formats Manual, cpio -- format of cpio archive
1968N/A files"
1968N/A http://leaf.dragonflybsd.org/cgi/web-man?command=cpio&section=5
1968N/A
- "A Quick Benchmark: Gzip vs. Bzip2 vs. LZMA", 31 May 2005:
1968N/A http://tukaani.org/lzma/benchmarks.html
1968N/A
1968N/A - "Lempel Ziv Markov Algorithm and 7-Zip", 7 Feb. 2008:
1968N/A http://blogs.sun.com/clayb/entry/lempel_ziv_markov_algorithm_and
1968N/A
1968N/A - "The Open Group Base Specifications Issue 6: pax Interchange
1968N/A Format, IEEE Std 1003.1, 2004 Edition"
1968N/A http://www.opengroup.org/onlinepubs/009695399/utilities/
1968N/A pax.html#tag_04_100_13_01
1968N/A
1968N/A - ".ZIP File Format Specification", 28 Sep. 2007:
1968N/A http://www.pkware.com/documents/casestudies/APPNOTE.TXT
1968N/A
1968N/A - "ZIP (file format)", 17 Apr. 2010:
1968N/A http://en.wikipedia.org/wiki/ZIP_%28file_format%29