metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:46338
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
- source_sentence: >-
What are the specific points and subparagraphs mentioned in the context of
Article 4(3) that relate to the introductory wording and how do they
connect to the provisions outlined in Article 3(1)?
sentences:
- >-
51 - Article 2, points 52, 53,54, 55 and 56 - Article 3 - Article 4(1)
Article 3(1), first subparagraph Article 4(2), first subparagraph
Article 4(2), second subparagraph Article 3(1), second subparagraph,
introductory wording Article 4(3), first subparagraph, introductory
wording Article 3(1), second subparagraph, points (a) and (b) Article
4(3), first subparagraph, points (a) and (b) Article 3(1), second
subparagraph, point (c) - Article 3(1), second subparagraph, point (d)
Article 4(3), first subparagraph, point (c) Article 3(1), third
subparagraph, introductory wording - - Article 4(3), first subparagraph,
point (d), introductory wording - Article 4(3), first subparagraph,
points (d)(i), (ii) and (iii) Article 3(1), third subparagraph, point
(a) Article 4(3), first subparagraph, point (d)(iv) - Article 4(3),
first subparagraph, point (e), introductory wording Article 3(1), third
subparagraph, point (b) Article 4(3), first subparagraph, point (e)(i)
Article 3(1), third subparagraph, point (c) Article 4(3), first
subparagraph, point (e)(ii) Article 3(1), third subparagraph, point (d)
Article 4(3), first subparagraph, point (e)(iii) Article 3(1), third
subparagraph, point (e) - - Article 4(3), first subparagraph, point
(e)(iv) Article 3(2) and (3) - Article 3(4) Article 35(6) Article 3(5)
and (6) - - Article 4(4) - Article 4(5) Article 4(6) Article 4(7) -
Article 5 Article 5(1), first subparagraph Article 6(1), first
subparagraph Article 5(1), second subparagraph Article 6(1), fifth
subparagraph - Article 6(1), second and third subparagraph Article 5(1),
third subparagraph Article 6(1), fourth subparagraph Article 5(1),
fourth and fifth subparagraph - Article 5(2) - Article 6(2) Article
6(2), second subparagraph Article 5(3) Article 6(3) Article 5(4) Article
6(4) Article 5(5) Article 6(5) Article 5(5), first subparagraph, point
(b) Article 6(5), second subparagraph, point (c) - Article 6(5), second
subparagraph, point (b) Article 5(6) Article 6(6) - Article 6(6), second
subparagraph, point (a) Article 5(6), second subparagraph Article 6(6),
second subparagraph, point (b) Article 5(6), third subparagraph Article
6(6), third subparagraph Article 5(7) - Article 6(1), first subparagraph
Article 7(1), first
- >-
ii.
measures to protect against retaliation its own workers who are
whistleblowers in accordance with the applicable law transposing
Directive (EU) 2019/1937 of the European Parliament and of the Council (
121 );
(d)
where the undertaking has no policies on the protection of
whistle-blowers ( 122 ), it shall state this and whether it has plans to
implement them and the timetable for implementation;
(e)
beyond the procedures to follow-up on reports by whistleblowers in
accordance with the applicable law transposing Directive (EU) 2019/1937,
whether the undertaking has procedures to investigate business conduct
incidents , including incidents of corruption and bribery , promptly,
independently and objectively;
(f)
where applicable, whether the undertaking has in place policies with
respect to animal welfare;
(g)
the undertaking’s policy for training within the organisation on
business conduct, including target audience, frequency and depth of
coverage; and
(h)
the functions within the undertaking that are most at risk in respect of
corruption and bribery .
Undertakings that are subject to legal requirements under national law
transposing Directive (EU) 2019/1937, or to equivalent legal
requirements with regard to the protection of whistle-blowers, may
comply with the disclosure specified in paragraph 10 (d) by stating that
they are subject to those legal requirements.
Disclosure Requirement G1-2 – Management of relationships with suppliers
The undertaking shall provide information about the management of its
relationships with its suppliers and its impacts on its supply chain.
The objective of this Disclosure Requirement is to provide an
understanding of the undertaking’s management of its procurement process
including fair behaviour with suppliers .
The undertaking shall provide a description of its policy to prevent
late payments, specifically to SMEs.
The disclosure required under paragraph 12 shall include the following
information:
(a)
the undertaking’s approach to its relationships with its suppliers ,
taking account of risks to the undertaking related to its supply chain
and of impacts on sustainability matters ; and
(b)
whether and how it takes into account social and environmental criteria
for the selection of its suppliers.
Disclosure Requirement G1-3 – Prevention and detection of corruption and
bribery
The undertaking shall provide information about its system to prevent
and detect, investigate, and respond to allegations or incidents
relating to corruption and bribery including the related training.
The objective of this Disclosure Requirement is to provide transparency
on the key procedures of the undertaking to prevent, detect, and address
allegations about corruption and bribery . This includes the training
provided to own workers and/or information provided internally or to
suppliers .
The disclosure required under paragraph 16 shall include the following
information:
(a)
a description of the procedures in place to prevent, detect, and address
allegations or incidents of corruption and bribery ;
(b)
whether the investigators or investigating committee are separate from
the chain of management involved in the matter; and
(c)
the process, if any, to report outcomes to the administrative,
management and supervisory bodies .
Where the undertaking has no such procedures in place, it shall disclose
this fact and, where applicable, its plans to adopt them.
The disclosures required by paragraph 16 shall include information about
how the undertaking communicates its policies to those for whom they are
relevant to ensure that the policy is accessible and that they
understand its implications.
The disclosure required by paragraph 16 shall include information about
the following with respect to training:
(a)
the nature, scope and depth of anti- corruption and anti- bribery
training programmes offered or required by the undertaking;
(b)
the percentage of functions-at-risk covered by training programmes; and
(c)
the extent to which training is given to members of the administrative,
management and supervisory bodies.
Metrics and targets
Disclosure Requirement G1-4 – Incidents of corruption or bribery
The undertaking shall provide information on incidents of corruption or
bribery during the reporting period.
- >-
(39)
‘algorithmic trading’ means trading in financial instruments where a
computer algorithm automatically determines individual parameters of
orders such as whether to initiate the order, the timing, price or
quantity of the order or how to manage the order after its submission,
with limited or no human intervention, and does not include any system
that is only used for the purpose of routing orders to one or more
trading venues or for the processing of orders involving no
determination of any trading parameters or for the confirmation of
orders or the post-trade processing of executed transactions;
(40)
‘high-frequency algorithmic trading technique’ means an algorithmic
trading technique characterised by:
(a)
- source_sentence: >-
What action does the Commission take if the scheme owner fails to address
the deficiencies and the scheme no longer meets the criteria in Annex IV?
sentences:
- >-
2.
Implementing partners shall fill out the Scoreboard for their proposals
for financing and investment operations.
3.
The Scoreboard shall cover the following elements:
(a)
a description of the proposed financing or investment operation;
(b)
how the proposed financing or investment operation contributes to EU
policy objectives;
(c)
a description of additionality;
(d)
a description of the market failure or suboptimal investment situation;
(e)
the financial and technical contribution by the implementing partner;
(f)
the impact of the investment;
(g)
the financial profile of the financing or investment operation;
(h)
complementary indicators.
4.
The Commission is empowered to adopt delegated acts in accordance with
Article 34 in order to supplement this Regulation by establishing
additional elements of the Scoreboard, including detailed rules for the
Scoreboard to be used by the implementing partners.
Article 23
Policy check
1.
The Commission shall conduct a check to confirm that the financing and
investment operations proposed by the implementing partners other than
the EIB comply with Union law and policies.
2.
EIB financing and investment operations that fall within the scope of
this Regulation shall not be covered by the EU guarantee where the
Commission delivers an unfavourable opinion within the framework of the
procedure provided for in Article 19 of the EIB Statute.
▼M1
3.
In the context of the procedures referred to in paragraphs 1 and 2 of
this Article, the Commission shall take into account any Sovereignty
Seal awarded in accordance with Article 4 of Regulation (EU) 2024/795 to
a project.
▼B
Article 24
Investment Committee
1.
A fully independent investment committee shall be established for the
InvestEU Fund (the ‘Investment Committee’). The Investment Committee
shall:
(a)
examine the proposals for financing and investment operations submitted
by implementing partners for coverage under the EU guarantee that have
passed the policy check referred to in Article 23(1) of this Regulation
or that have received a favourable opinion within the framework of the
procedure provided for in Article 19 of the EIB Statute;
(b)
- >-
(6) | The maritime transport sector is subject to strong international
competition. Major differences in regulatory burdens across flag states
have often led to unwanted practices such as the reflagging of ships.
The sector’s intrinsic global character underlines the importance of a
flag-neutral approach and of a favourable regulatory environment, which
would help to attract new investment and safeguard the competitiveness
of Union ports, shipowners and ship operators.
- >-
8.
Where the scheme owner fails or refuses to take the necessary remedial
action and where the Commission has determined that the deficiencies
referred to in paragraph 6 of this Article mean that the scheme no
longer fulfils the criteria laid down in Annex IV, or of the recognised
subset of those criteria, the Commission shall withdraw the recognition
of the scheme by means of implementing acts. Those implementing acts
shall be adopted in accordance with the examination procedure referred
to in Article 39(3).
9.
- source_sentence: >-
What roles do upstream and downstream business partners play in the
overall production and distribution process as described?
sentences:
- >-
(25) The chain of activities should cover activities of a company’s
upstream business partners related to the production of goods or the
provision of services by the company, including the design, extraction,
sourcing, manufacture, transport, storage and supply of raw materials,
products or parts of the products and development of the product or the
service, and activities of a company’s downstream business partners
related to the distribution, transport and storage of the product, where
the business partners carry out those activities for the company or on
behalf of the company. This Directive should not cover the disposal of
the product. In addition, under this Directive the chain of activities
should not encompass the distribution,
- >-
7.
Any actor in the supply chain who is required to prepare a chemical
safety report according to Articles 14 or 37 shall place the relevant
exposure scenarios (including use and exposure categories where
appropriate) in an annex to the safety data sheet covering identified
uses and including specific conditions resulting from the application of
Section 3 of Annex XI.
Any downstream user shall include relevant exposure scenarios, and use
other relevant information, from the safety data sheet supplied to him
when compiling his own safety data sheet for identified uses.
- >-
8.
Authorisations shall be subject to a time-limited review without
prejudice to any decision on a future review period and shall normally
be subject to conditions, including monitoring. The duration of the
time-limited review for any authorisation shall be determined on a
case-by-case basis taking into account all relevant information
including the elements listed in paragraph 4(a) to (d), as appropriate.
9.
The authorisation shall specify:
(a)
the person(s) to whom the authorisation is granted;
(b)
the identity of the substance(s);
(c)
the use(s) for which the authorisation is granted;
(d)
any conditions under which the authorisation is granted;
(e)
the time-limited review period;
(f)
any monitoring arrangement.
10.
- source_sentence: >-
What conditions must be met for the stability study in organic solvents to
be deemed unnecessary for a substance?
sentences:
- >-
AR 23. When disclosing information required under paragraph 29 for the
purpose of setting targets the undertaking shall consider the need for
an informed and willing consent of local and indigenous peoples , the
need for appropriate consultations and the need to respect the decisions
of these communities.
AR 24. The targets related to material impacts may be presented in a
table as illustrated below:
Type of target according to mitigation hierarchy Baseline value and base
year Target value and geographical scope Connected policy or legislation
if relevant 2025 2030 Up to 2050 Avoidance Minimisation Rehabilitation
and restoration Compensation or offsets
- >-
1.
Member States shall, in accordance with paragraph 2, draw up a register
of producers, including producers supplying EEE by means of distance
communication. That register shall serve to monitor compliance with the
requirements of this Directive.
Producers supplying EEE by means of distance communication as defined in
Article 3(1)(f)(iv) shall be registered in the Member State that they
sell to. Where such producers are not registered in the Member State
that they are selling to, they shall be registered through their
authorised representatives as referred to in Article 17(2).
2.
Member States shall ensure that:
(a)
each producer, or each authorised representative where appointed under
Article 17, is registered as required and has the possibility of
entering online in their national register all relevant information
reflecting that producer’s activities in that Member State;
(b)
upon registering, each producer, or each authorised representative where
appointed under Article 17, provides the information set out in Annex X,
Part A, undertaking to update it as appropriate;
(c)
each producer, or each authorised representative where appointed under
Article 17, provides the information set out in Annex X, Part B;
(d)
national registers provide links to other national registers on their
website to facilitate, in all Member States, registration of producers
or, where appointed under Article 17, authorised representatives.
3.
In order to ensure uniform conditions for the implementation of this
Article, the Commission shall adopt implementing acts establishing the
format for registration and reporting and the frequency of reporting to
the register. Those implementing acts shall be adopted in accordance
with the examination procedure referred to in Article 21(2).
4.
Member States shall collect information, including substantiated
estimates, on an annual basis, on the quantities and categories of EEE
placed on their markets, collected through all routes, prepared for
re-use, recycled and recovered within the Member State, and on
separately collected WEEE exported, by weight.
▼M1 —————
▼M1
6.
- >-
COLUMN 1 STANDARD INFORMATION REQUIRED COLUMN 2 SPECIFIC RULES FOR
ADAPTATION FROM COLUMN 1 7.15. Stability in organic solvents and
identity of relevant degradation products Only required if stability of
the substance is considered to be critical. 7.15. The study does not
need to be conducted if the substance is inorganic. 7.16. Dissociation
constant 7.16. The study does not need to be conducted if: — the
substance is hydrolytically unstable (half-life less than 12 hours) or
is readily oxidisable in water, or ►M70 ◄ ►M64 — or based on the
structure, the substance does not have any chemical group that can
dissociate. ◄ 7.17. Viscosity ►M64 For hydrocarbon substances the
kinematic viscosity shall be determined at 40 °C. ◄
- source_sentence: >-
How is 'associated undertaking' defined, and what criteria determine the
significant influence of one undertaking over another in terms of voting
rights?
sentences:
- >-
▼B
(6)
‘purchase price’ means the price payable and any incidental expenses
minus any incidental reductions in the cost of acquisition;
(7)
‘production cost’ means the purchase price of raw materials, consumables
and other costs directly attributable to the item in question. Member
States shall permit or require the inclusion of a reasonable proportion
of fixed or variable overhead costs indirectly attributable to the item
in question, to the extent that they relate to the period of production.
Distribution costs shall not be included;
(8)
‘value adjustment’ means the adjustments intended to take account of
changes in the values of individual assets established at the balance
sheet date, whether the change is final or not;
(9)
‘parent undertaking’ means an undertaking which controls one or more
subsidiary undertakings;
(10)
‘subsidiary undertaking’ means an undertaking controlled by a parent
undertaking, including any subsidiary undertaking of an ultimate parent
undertaking;
(11)
‘group’ means a parent undertaking and all its subsidiary undertakings;
(12)
‘affiliated undertakings’ means any two or more undertakings within a
group;
(13)
‘associated undertaking’ means an undertaking in which another
undertaking has a participating interest, and over whose operating and
financial policies that other undertaking exercises significant
influence. An undertaking is presumed to exercise a significant
influence over another undertaking where it has 20 % or more of the
shareholders' or members' voting rights in that other undertaking;
(14)
‘investment undertakings’ means:
(a)
undertakings the sole object of which is to invest their funds in
various securities, real property and other assets, with the sole aim of
spreading investment risks and giving their shareholders the benefit of
the results of the management of their assets,
(b)
undertakings associated with investment undertakings with fixed capital,
if the sole object of those associated undertakings is to acquire fully
paid shares issued by those investment undertakings without prejudice to
point (h) of Article 22(1) of Directive 2012/30/EU;
(15)
- >-
and non-European non-financial corporations not subject to the
disclosure obligations laid down in Directive 2013/34/EU. That
information may be disclosed only once, based on counterparties’
turnover alignment for the general-purpose lending loans, as in the case
of the GAR. The first disclosure reference date of this template is as
of 31 December 2024. Institutions are not required to disclose this
information before 1 January 2025. ---|---|---
- >-
ANNEX II
Due diligence statement
Information to be contained in the due diligence statement in accordance
with Article 4(2):
1.
Operator’s name, address and, in the event of relevant commodities and
relevant products entering or leaving the market, the Economic Operators
Registration and Identification (EORI) number in accordance with Article
9 of Regulation (EU) No 952/2013.
2.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Alibaba-NLP/gte-modernbert-base
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.6910063870188158
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9109269808389435
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9461418953909891
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9742793026065941
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6910063870188158
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.30364232694631454
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.18922837907819778
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09742793026065939
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6910063870188158
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9109269808389435
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9461418953909891
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9742793026065941
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8471731447814336
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.804833419644399
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8061197699360279
name: Cosine Map@100
SentenceTransformer based on Alibaba-NLP/gte-modernbert-base
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-modernbert-base
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
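The Pooling module above enables only pooling_mode_cls_token, so the sentence embedding is the vector of the first token rather than an average over tokens. A minimal sketch of that difference follows; the helper names (cls_pool, mean_pool) and the 3-dimensional toy vectors are illustrative assumptions, not part of the library.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector, mirroring pooling_mode_cls_token=True."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the vectors of non-padding tokens (shown only for contrast)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask removes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy sequence: [CLS], two real tokens, one padding token (3 dims for brevity;
# the real model produces 768-dimensional vectors).
tokens = [[1.0, 0.0, 2.0], [3.0, 3.0, 3.0], [5.0, 0.0, 1.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]

cls_vector = cls_pool(tokens)          # what this model's Pooling layer keeps
mean_vector = mean_pool(tokens, mask)  # what mean pooling would give instead
```

With CLS pooling the padding mask plays no role in the pooled vector itself, because only position 0 is read.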
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
"How is 'associated undertaking' defined, and what criteria determine the significant influence of one undertaking over another in terms of voting rights?",
"▼B\n\n(6)\n\n‘purchase price’ means the price payable and any incidental expenses minus any incidental reductions in the cost of acquisition;\n\n(7)\n\n‘production cost’ means the purchase price of raw materials, consumables and other costs directly attributable to the item in question. Member States shall permit or require the inclusion of a reasonable proportion of fixed or variable overhead costs indirectly attributable to the item in question, to the extent that they relate to the period of production. Distribution costs shall not be included;\n\n(8)\n\n‘value adjustment’ means the adjustments intended to take account of changes in the values of individual assets established at the balance sheet date, whether the change is final or not;\n\n(9)\n\n‘parent undertaking’ means an undertaking which controls one or more subsidiary undertakings;\n\n(10)\n\n‘subsidiary undertaking’ means an undertaking controlled by a parent undertaking, including any subsidiary undertaking of an ultimate parent undertaking;\n\n(11)\n\n‘group’ means a parent undertaking and all its subsidiary undertakings;\n\n(12)\n\n‘affiliated undertakings’ means any two or more undertakings within a group;\n\n(13)\n\n‘associated undertaking’ means an undertaking in which another undertaking has a participating interest, and over whose operating and financial policies that other undertaking exercises significant influence. An undertaking is presumed to exercise a significant influence over another undertaking where it has 20 % or more of the shareholders' or members' voting rights in that other undertaking;\n\n(14)\n\n‘investment undertakings’ means:\n\n(a)\n\nundertakings the sole object of which is to invest their funds in various securities, real property and other assets, with the sole aim of spreading investment risks and giving their shareholders the benefit of the results of the management of their assets,\n\n(b)\n\nundertakings associated with investment undertakings with fixed capital, if the sole object of those associated undertakings is to acquire fully paid shares issued by those investment undertakings without prejudice to point (h) of Article 22(1) of Directive 2012/30/EU;\n\n(15)",
'and non-European non-financial corporations not subject to the disclosure obligations laid down in Directive 2013/34/EU. That information may be disclosed only once, based on counterparties’ turnover alignment for the general-purpose lending loans, as in the case of the GAR. The first disclosure reference date of this template is as of 31 December 2024. Institutions are not required to disclose this information before 1 January 2025. ---|---|---',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
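For retrieval, the first row of the similarity matrix (query against passages) is usually what matters. The sketch below ranks passages for one query by cosine similarity, the similarity function listed for this model. It runs on small hand-written vectors so that it needs no model download; rank_passages and the toy data are hypothetical names introduced here for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings[0] (the query) and embeddings[1:] (the passages).
query = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]

ranking = rank_passages(query, passages, top_k=2)
best_index = ranking[0][0]
```

With real embeddings from model.encode, the same function could be called as rank_passages(embeddings[0], embeddings[1:]), assuming the first sentence is the query as in the example above.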
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.691 |
cosine_accuracy@3 | 0.9109 |
cosine_accuracy@5 | 0.9461 |
cosine_accuracy@10 | 0.9743 |
cosine_precision@1 | 0.691 |
cosine_precision@3 | 0.3036 |
cosine_precision@5 | 0.1892 |
cosine_precision@10 | 0.0974 |
cosine_recall@1 | 0.691 |
cosine_recall@3 | 0.9109 |
cosine_recall@5 | 0.9461 |
cosine_recall@10 | 0.9743 |
cosine_ndcg@10 | 0.8472 |
cosine_mrr@10 | 0.8048 |
cosine_map@100 | 0.8061 |
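In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.9109 / 3 ≈ 0.3036). That pattern is what one would expect if each query has exactly one relevant passage, although the card does not state this. Under that assumption, the cutoff metrics can be sketched from the rank of the single relevant passage per query. The function name and the toy ranks below are illustrative, not the evaluator's own code.

```python
import math
from typing import Dict, List


def single_relevant_metrics(ranks: List[int], k: int) -> Dict[str, float]:
    """Metrics at cutoff k, assuming one relevant passage per query.

    ranks[i] is the 1-based position of query i's relevant passage,
    or 0 if it was not retrieved at all.
    """
    n = len(ranks)
    in_top_k = [r for r in ranks if 1 <= r <= k]
    accuracy = len(in_top_k) / n
    return {
        "accuracy": accuracy,
        # Only one of the k retrieved passages can be relevant.
        "precision": accuracy / k,
        # One relevant passage: it is either found (recall 1) or not (recall 0).
        "recall": accuracy,
        "mrr": sum(1.0 / r for r in in_top_k) / n,
        # Ideal DCG is 1 when there is a single relevant passage.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in in_top_k) / n,
    }


# Toy ranks for five queries; the last query's passage was never retrieved.
ranks = [1, 2, 1, 4, 0]
at_3 = single_relevant_metrics(ranks, k=3)
```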
Training Details
Training Dataset
Unnamed Dataset
- Size: 46,338 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
Statistic | sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 13 tokens, mean: 34.18 tokens, max: 251 tokens | min: 7 tokens, mean: 231.33 tokens, max: 2146 tokens |
- Samples:
sentence_0 | sentence_1 |
---|---|
How is 'energy efficiency' defined in the context of Directive (EU) 2018/2001? | of Directive (EU) 2018/2001; --- --- (8) ‘energy efficiency’ means the ratio of output of performance, service, goods or energy to input of energy; --- --- (9) ‘energy savings’ means an amount of saved energy determined by measuring or estimating consumption, or both,, before and after the implementation of an energy efficiency improvement measure, whilst ensuring normalisation for external conditions that affect energy consumption; --- --- (10) ‘energy efficiency improvement’ means an increase in energy efficiency as a result of any technological, behavioural or economic changes; --- --- (11) ‘energy service’ means the physical benefit, utility or good derived from a combination of energy with energy-efficient technology or with action, |
What are the sources of information that the external experts will use to create the list of conflict-affected and high-risk areas? | 2. The Commission shall call upon external expertise that will provide an indicative, non-exhaustive, regularly updated list of conflict-affected and high-risk areas. That list shall be based on the external experts' analysis of the handbook referred to in paragraph 1 and existing information from, inter alia, academics and supply chain due diligence schemes. Union importers sourcing from areas which are not mentioned on that list shall also maintain their responsibility to comply with the due diligence obligations under this Regulation. Article 15 Committee procedure 1. The Commission shall be assisted by a committee. That committee shall be a committee within the meaning of Regulation (EU) No 182/2011. 2. |
What is the maximum time frame for completing the undertaking according to the technical specifications set out in Annexes II and III after the Directive enters into force? | is undertaken according to the technical specifications set out in Annexes II and III and that it is completed at the latest four years after the date of entry into force of this Directive. 2. The analyses and reviews mentioned under paragraph 1 shall be reviewed, and if necessary updated at the latest 13 years after the date of entry into force of this Directive and every six years thereafter. Article 6 Register of protected areas |
- Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [768, 512, 256, 128, 64],
  "matryoshka_weights": [1, 1, 1, 1, 1],
  "n_dims_per_step": -1
}
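The parameters above say that a MultipleNegativesRankingLoss is applied at several embedding sizes and the results are summed with equal weights. The following is a rough pure-Python sketch of that idea, not the library's implementation: it truncates each embedding to its first dim components, re-normalises, and computes an in-batch-negatives cross-entropy. The scale of 20 and the use of cosine similarity are assumptions (the card does not list the inner loss settings), and the 4-dimensional toy vectors stand in for the real 768-dimensional ones.

```python
import math
from typing import List, Sequence

Vector = List[float]


def truncate_and_normalize(vec: Sequence[float], dim: int) -> Vector:
    """Keep the first `dim` components and rescale to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0.0 else head


def in_batch_ranking_loss(
    queries: List[Vector], passages: List[Vector], scale: float = 20.0
) -> float:
    """Cross-entropy where passage i is the positive for query i and every
    other passage in the batch is a negative. Inputs must be unit vectors.
    The scale value is an assumption, not taken from the card."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * sum(a * b for a, b in zip(q, p)) for p in passages]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)


def matryoshka_loss(
    queries: List[Vector],
    passages: List[Vector],
    dims: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Weighted sum of the ranking loss over every truncation size."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        q = [truncate_and_normalize(v, dim) for v in queries]
        p = [truncate_and_normalize(v, dim) for v in passages]
        loss += weight * in_batch_ranking_loss(q, p)
    return loss


# Toy batch of two (query, passage) pairs with 4-dim embeddings.
toy_queries = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
toy_passages = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]

aligned = matryoshka_loss(toy_queries, toy_passages, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(toy_queries, toy_passages[::-1], dims=[4, 2], weights=[1, 1])
```

One practical consequence of training this way is that embeddings can plausibly be truncated to 512, 256, 128 or 64 dimensions and re-normalised at inference time, trading some accuracy for storage; the card reports metrics only for the full 768 dimensions.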
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin
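These settings determine the length of the run. As a worked check, the sketch below derives the optimizer steps per epoch from the 46,338 training samples and the per-device batch size of 4, then evaluates a linear learning-rate decay from the listed learning_rate of 5e-05 with no warmup. The single-device setting is an assumption; it is consistent with the Training Logs table, which reports step 11585 at epoch 1.0. The helper names are illustrative.

```python
import math

TRAIN_SAMPLES = 46_338   # dataset size from the card
PER_DEVICE_BATCH = 4     # per_device_train_batch_size
EPOCHS = 4               # num_train_epochs
PEAK_LR = 5e-05          # learning_rate
NUM_DEVICES = 1          # assumption: not stated in the card


def steps_per_epoch(samples: int, batch: int, devices: int = 1) -> int:
    """Optimizer steps per epoch when the last partial batch is kept
    (dataloader_drop_last is False) and gradient accumulation is 1."""
    return math.ceil(samples / (batch * devices))


def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


per_epoch = steps_per_epoch(TRAIN_SAMPLES, PER_DEVICE_BATCH, NUM_DEVICES)
total_steps = per_epoch * EPOCHS
epoch_at_500 = 500 / per_epoch  # epoch fraction logged at step 500
lr_halfway = linear_lr(total_steps // 2, total_steps, PEAK_LR)
```

Under these assumptions the run has 11,585 steps per epoch and 46,340 steps in total, and step 500 falls at about epoch 0.0432, matching the first row of the training log.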
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
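
The scheduler settings listed here (lr_scheduler_type: linear, learning_rate: 5e-05, warmup_steps: 0, num_train_epochs: 4) suggest a learning rate that starts at its peak and decays linearly to zero over the run. The following is a minimal sketch of that schedule in plain Python, not the Trainer's own code. The function name `linear_lr` is hypothetical, and the total step count assumes 11585 optimizer steps per epoch, the figure reported at epoch 1.0 in the training logs.

```python
def linear_lr(step, total_steps, peak_lr=5e-05, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr during warmup (unused here: warmup_steps is 0).
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 4 epochs of 11585 steps each, as implied by the training logs.
TOTAL_STEPS = 4 * 11585

if __name__ == "__main__":
    for step in (0, 11585, 23170, 42000):
        print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate is halved by the end of epoch 2 and is already below one tenth of its peak at the last logged step (42000).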
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
0.0432 | 500 | 0.358 | - |
0.0863 | 1000 | 0.1048 | - |
0.1295 | 1500 | 0.0827 | - |
0.1726 | 2000 | 0.067 | 0.7969 |
0.2158 | 2500 | 0.0491 | - |
0.2590 | 3000 | 0.0831 | - |
0.3021 | 3500 | 0.062 | - |
0.3453 | 4000 | 0.0657 | 0.8050 |
0.3884 | 4500 | 0.0522 | - |
0.4316 | 5000 | 0.049 | - |
0.4748 | 5500 | 0.0426 | - |
0.5179 | 6000 | 0.0708 | 0.8215 |
0.5611 | 6500 | 0.0236 | - |
0.6042 | 7000 | 0.024 | - |
0.6474 | 7500 | 0.0256 | - |
0.6905 | 8000 | 0.041 | 0.8105 |
0.7337 | 8500 | 0.0285 | - |
0.7769 | 9000 | 0.0249 | - |
0.8200 | 9500 | 0.0368 | - |
0.8632 | 10000 | 0.0588 | 0.8118 |
0.9063 | 10500 | 0.0386 | - |
0.9495 | 11000 | 0.0456 | - |
0.9927 | 11500 | 0.0399 | - |
1.0 | 11585 | - | 0.8184 |
1.0358 | 12000 | 0.0424 | 0.8239 |
1.0790 | 12500 | 0.0107 | - |
1.1221 | 13000 | 0.0279 | - |
1.1653 | 13500 | 0.0236 | - |
1.2085 | 14000 | 0.024 | 0.8193 |
1.2516 | 14500 | 0.0143 | - |
1.2948 | 15000 | 0.0118 | - |
1.3379 | 15500 | 0.0078 | - |
1.3811 | 16000 | 0.023 | 0.8217 |
1.4243 | 16500 | 0.0239 | - |
1.4674 | 17000 | 0.0335 | - |
1.5106 | 17500 | 0.0119 | - |
1.5537 | 18000 | 0.0411 | 0.8292 |
1.5969 | 18500 | 0.0168 | - |
1.6401 | 19000 | 0.0059 | - |
1.6832 | 19500 | 0.0234 | - |
1.7264 | 20000 | 0.0184 | 0.8366 |
1.7695 | 20500 | 0.0128 | - |
1.8127 | 21000 | 0.0166 | - |
1.8558 | 21500 | 0.0181 | - |
1.8990 | 22000 | 0.0148 | 0.8353 |
1.9422 | 22500 | 0.0225 | - |
1.9853 | 23000 | 0.0158 | - |
2.0 | 23170 | - | 0.8360 |
2.0285 | 23500 | 0.0123 | - |
2.0716 | 24000 | 0.0173 | 0.8329 |
2.1148 | 24500 | 0.0167 | - |
2.1580 | 25000 | 0.0125 | - |
2.2011 | 25500 | 0.013 | - |
2.2443 | 26000 | 0.0079 | 0.8338 |
2.2874 | 26500 | 0.007 | - |
2.3306 | 27000 | 0.0171 | - |
2.3738 | 27500 | 0.0058 | - |
2.4169 | 28000 | 0.0048 | 0.8405 |
2.4601 | 28500 | 0.005 | - |
2.5032 | 29000 | 0.0141 | - |
2.5464 | 29500 | 0.0132 | - |
2.5896 | 30000 | 0.006 | 0.8461 |
2.6327 | 30500 | 0.0095 | - |
2.6759 | 31000 | 0.0061 | - |
2.7190 | 31500 | 0.0107 | - |
2.7622 | 32000 | 0.0157 | 0.8451 |
2.8054 | 32500 | 0.005 | - |
2.8485 | 33000 | 0.0087 | - |
2.8917 | 33500 | 0.0064 | - |
2.9348 | 34000 | 0.005 | 0.8449 |
2.9780 | 34500 | 0.0115 | - |
3.0 | 34755 | - | 0.8451 |
3.0211 | 35000 | 0.0079 | - |
3.0643 | 35500 | 0.0045 | - |
3.1075 | 36000 | 0.0029 | 0.8443 |
3.1506 | 36500 | 0.0161 | - |
3.1938 | 37000 | 0.0144 | - |
3.2369 | 37500 | 0.0076 | - |
3.2801 | 38000 | 0.0157 | 0.8500 |
3.3233 | 38500 | 0.0039 | - |
3.3664 | 39000 | 0.0045 | - |
3.4096 | 39500 | 0.0033 | - |
3.4527 | 40000 | 0.0064 | 0.8434 |
3.4959 | 40500 | 0.0054 | - |
3.5391 | 41000 | 0.0061 | - |
3.5822 | 41500 | 0.0051 | - |
3.6254 | 42000 | 0.0019 | 0.8472 |
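
The Epoch column can be cross-checked against the Step column. As a rough sketch, assuming a single training device (gradient_accumulation_steps is 1), one epoch is the dataset size divided by the batch size, rounded up, and the epoch value is the step divided by that figure. The evaluation rows below are a small subset copied from the table; `epoch_at` is a hypothetical helper name.

```python
import math

DATASET_SIZE = 46338  # dataset_size tag in the model card metadata
BATCH_SIZE = 4        # per_device_train_batch_size; a single device is assumed

# The last batch of each epoch is partial, hence the ceiling.
steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


# (step, cosine_ndcg@10) pairs copied from a few evaluation rows of the table.
evals = [(2000, 0.7969), (12000, 0.8239), (20000, 0.8366),
         (30000, 0.8461), (38000, 0.8500), (42000, 0.8472)]
best_step, best_score = max(evals, key=lambda row: row[1])

if __name__ == "__main__":
    print(steps_per_epoch, epoch_at(500), epoch_at(42000))
    print(best_step, best_score, epoch_at(best_step))
```

This gives 11585 steps per epoch, which matches the row logged at epoch 1.0, and the highest cosine_ndcg@10 among the sampled rows (0.8500) falls at step 38000, a little past epoch 3.28.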
Framework Versions
- Python: 3.10.15
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu126
- Accelerate: 1.5.2
- Datasets: 3.4.1
- Tokenizers: 0.21.1
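
To compare a local environment with the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is only a sketch: the distribution names in `EXPECTED` are the usual PyPI names and are an assumption here, and `compare_versions` is a hypothetical helper. A package that is not installed is reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; the PyPI distribution names are assumed.
EXPECTED = {
    "sentence-transformers": "3.4.1",
    "transformers": "4.49.0",
    "torch": "2.6.0+cu126",
    "accelerate": "1.5.2",
    "datasets": "3.4.1",
    "tokenizers": "0.21.1",
}


def compare_versions(expected):
    """Map each package to (expected, installed or None, whether they match)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}, match={ok}")
```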
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}